Building a Visual Analyzer for Weaviate HNSW Index Performance with Qwik and Express.js


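The default parameters of vector search are like a car's automatic transmission: reliable and easy to use, but not capable of track-level performance. Once an application's requirements move from "good enough" to "as fast and as accurate as possible", digging into the vector database's index internals and hand-tuning its more obscure parameters becomes unavoidable. The difficulty is that these parameters (for example, ef and efConstruction in the HNSW index) do not affect performance linearly, so tuning them often feels like alchemy, full of uncertainty.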

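What we are building this time is not a simple search application but a workbench for that alchemy: an analyzer that explores the performance boundaries of a Weaviate HNSW index interactively and visually. Express.js serves as the backend. It is our bridge to Weaviate and measures the duration of every query. The frontend uses Qwik, which suits this scenario well. Its resumability avoids hydration cost and keeps interaction latency close to zero, so a user dragging the parameter slider can see millisecond-level changes in backend performance right away.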

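Our core task is to build an API endpoint that accepts not only a search term but also an HNSW ef (search-time effort) parameter supplied by the client, and returns the query duration together with the results. This replaces traditional black-box tuning with direct, measurable feedback.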

// file: server/src/api/search.js

import { Router } from 'express';
import { getWeaviateClient } from '../services/weaviate.js';
import { performance } from 'perf_hooks';
import logger from '../utils/logger.js';

const router = Router();
const CLASS_NAME = 'Article'; // Weaviate schema class name

/**
 * Middleware that exposes a timing helper on the response object.
 * The route handler wraps only the Weaviate call with it, so the measured
 * duration excludes request parsing and response serialization.
 */
const measureQueryTime = (req, res, next) => {
    res.measure = async (queryFn) => {
        // Start the clock when the query is issued, not when the middleware runs.
        const start = performance.now();
        try {
            return await queryFn();
        } finally {
            // Record the duration even if the query throws. The error propagates
            // to the route handler's catch block, which calls next(err) exactly once.
            res.locals.queryDuration = parseFloat((performance.now() - start).toFixed(2));
        }
    };
    next();
};

router.get('/', measureQueryTime, async (req, res, next) => {
    const {
        query,
        limit = 10,
        ef // The magic parameter from the frontend!
    } = req.query;

    if (!query) {
        return res.status(400).json({ error: 'Query parameter is required.' });
    }

    const client = getWeaviateClient();
    const efValue = ef ? parseInt(ef, 10) : -1; // -1 tells Weaviate to use its default

    try {
        const queryBuilder = client.graphql
            .get()
            .withClassName(CLASS_NAME)
            .withNearText({ concepts: [query] })
            .withLimit(parseInt(limit, 10))
            .withFields('title content _additional { id distance }');

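        // Dynamically add search-time HNSW config if ef is provided.
        // NOTE: withHnsw is not among the documented weaviate-ts-client GraphQL
        // builder methods; verify that your client version provides it. ef can
        // also be changed by updating the class's vectorIndexConfig.
        if (efValue > 0) {
            queryBuilder.withHnsw({ ef: efValue });
        }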
        
        // Execute the query via our measurement wrapper
        const weaviateResponse = await res.measure(() => queryBuilder.do());

        const results = weaviateResponse.data.Get[CLASS_NAME];

        res.json({
            results,
            queryDurationMs: res.locals.queryDuration,
            ef: efValue > 0 ? efValue : 'default',
        });

    } catch (err) {
        logger.error({
            message: 'Weaviate search failed',
            error: err.message,
            stack: err.stack,
            query,
        });
        // Forward the error to the centralized error handler
        next(err);
    }
});

export default router;

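This Express code is the core of the system. Note the design of the measureQueryTime middleware: rather than timing the whole request, it attaches a measure function to the res object so that the route handler can wrap exactly the queryBuilder.do() asynchronous call. This keeps client-to-server network latency, request parsing, and response serialization out of the measurement. The number still includes the round trip from Express to Weaviate and, because nearText is used, the time the vectorizer module takes to embed the query, so it is best read as a relative measure across ef values rather than as pure index time. More importantly, the handler builds the withHnsw({ ef: efValue }) part of the query dynamically from the ef value sent by the frontend, and that is the control knob of our workbench.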
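
The handler above passes ef and limit through parseInt without checking the result, so a value such as "abc" or an out-of-range number would reach the client unchecked. The following is a minimal sketch of a validation step, not part of the original code: parseSearchParams, the constants, and the chosen bounds are all hypothetical (the ef bounds simply mirror the slider range used later in the frontend, and the limit cap is an assumption).

```javascript
// Hypothetical helper: validate the raw query-string values before they
// are handed to the Weaviate client. All names and bounds are assumptions.
const EF_MIN = 16;     // mirrors the slider's lower bound
const EF_MAX = 512;    // mirrors the slider's upper bound
const LIMIT_MAX = 100; // assumed cap on result count

function parseSearchParams(raw) {
    const query = typeof raw.query === 'string' ? raw.query.trim() : '';
    if (!query) {
        return { ok: false, error: 'Query parameter is required.' };
    }

    const limit = Number.parseInt(raw.limit ?? '10', 10);
    if (!Number.isInteger(limit) || limit < 1 || limit > LIMIT_MAX) {
        return { ok: false, error: `limit must be an integer between 1 and ${LIMIT_MAX}.` };
    }

    // A missing ef means "let Weaviate decide"; represent that as null.
    let ef = null;
    if (raw.ef !== undefined && raw.ef !== '') {
        ef = Number.parseInt(raw.ef, 10);
        if (!Number.isInteger(ef) || ef < EF_MIN || ef > EF_MAX) {
            return { ok: false, error: `ef must be an integer between ${EF_MIN} and ${EF_MAX}.` };
        }
    }

    return { ok: true, value: { query, limit, ef } };
}
```

In the route handler, a failed result would map to a 400 response carrying the error message, and a null ef would correspond to the 'default' branch.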

Step 1: Laying the Foundation - Weaviate Schema and Data Ingestion

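Without data and an index configuration, none of this matters. In a real project, the index's build-time parameters (such as efConstruction and maxConnections) are just as important as the search-time parameter (ef). The former determine the quality of the graph and how long it takes to build; the latter determines precision and latency at search time. Our goal is to hold the build-time parameters fixed and explore the effect of the search-time parameter.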

First, define the Weaviate schema. The key part is vectorIndexConfig, where we set the build-time performance baseline for the HNSW index.

// file: server/src/services/weaviate.js

import weaviate from 'weaviate-ts-client';
import dotenv from 'dotenv';

dotenv.config();

const clientConfig = {
    scheme: process.env.WEAVIATE_SCHEME || 'http',
    host: process.env.WEAVIATE_HOST || 'localhost:8080',
};

// Singleton pattern for the Weaviate client
let weaviateClient = null;

export const getWeaviateClient = () => {
    if (!weaviateClient) {
        weaviateClient = weaviate.client(clientConfig);
    }
    return weaviateClient;
};

export const setupSchema = async () => {
    const client = getWeaviateClient();
    const className = 'Article';

    const existingClasses = await client.schema.getter().do();
    if (existingClasses.classes.some(c => c.class === className)) {
        console.log(`Class "${className}" already exists. Skipping schema creation.`);
        return;
    }

    const schemaConfig = {
        'class': className,
        'description': 'A piece of text content',
        'vectorizer': 'text2vec-openai', // Or your choice of vectorizer module
        'moduleConfig': {
            'text2vec-openai': {
                'model': 'ada',
                'modelVersion': '002',
                'type': 'text'
            }
        },
        'properties': [
            {
                'name': 'title',
                'dataType': ['string'],
                'description': 'The title of the article',
            },
            {
                'name': 'content',
                'dataType': ['text'],
                'description': 'The main content of the article',
            }
        ],
        // The most critical part for our experiment
        'vectorIndexConfig': {
            'skip': false,
            'cleanupIntervalSeconds': 300,
            'maxConnections': 32,      // Connections per node in the graph. Higher means more memory/build time, better recall.
            'efConstruction': 256,     // Build-time effort. Higher means a better quality graph, longer build time.
            'ef': -1,                  // Search-time effort. -1 enables dynamic ef, bounded by the dynamicEf* settings below.
            'dynamicEfMin': 100,
            'dynamicEfMax': 500,
            'dynamicEfFactor': 8,
            'vectorCacheMaxObjects': 1000000,
            'flatSearchCutoff': 40000,
            'distance': 'cosine',      // Distance metric
        }
    };

    await client.schema.classCreator().withClass(schemaConfig).do();
    console.log(`Schema for class "${className}" created successfully.`);
};

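In vectorIndexConfig, maxConnections and efConstruction are the bets we place on index quality. Higher values produce a more finely connected graph at the cost of more build time and memory. We chose a relatively high pair (32/256) so that the index is solid enough for changes to ef to show clearly.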
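
It also helps to know which ef value applies when none is sent. According to Weaviate's documentation, ef: -1 makes the engine derive ef from the query limit: limit multiplied by dynamicEfFactor, capped between dynamicEfMin and dynamicEfMax. The sketch below only illustrates that documented rule with the values from the schema above; effectiveDynamicEf is a hypothetical name, and the engine's actual implementation may differ in detail.

```javascript
// Illustrative only: the documented dynamic-ef rule, using the schema's values
// as defaults. The function name is hypothetical.
function effectiveDynamicEf(limit, { dynamicEfMin = 100, dynamicEfMax = 500, dynamicEfFactor = 8 } = {}) {
    const scaled = limit * dynamicEfFactor;
    // Cap the scaled value on both ends.
    return Math.min(dynamicEfMax, Math.max(dynamicEfMin, scaled));
}

console.log(effectiveDynamicEf(10));  // 100 (80 is raised to the lower bound)
console.log(effectiveDynamicEf(15));  // 120
console.log(effectiveDynamicEf(100)); // 500 (800 is capped at the upper bound)
```

With a limit of 15, the default under this rule would be ef = 120, so explicit values below that trade recall for speed relative to the default, and values above it do the opposite.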

Next comes data ingestion. A good experiment needs meaningful data, so we use a simple script to batch-import the data our analyzer will query.

// file: server/scripts/seed.js

import { getWeaviateClient, setupSchema } from '../src/services/weaviate.js';
import logger from '../src/utils/logger.js';

// A sample dataset. In a real project, this would come from a file or database.
const sampleData = [
    { title: "Qwik Resumability", content: "Qwik's resumability allows applications to start instantly by pausing execution in the server and resuming it on the client, avoiding costly hydration." },
    { title: "Weaviate HNSW Index", content: "Weaviate uses the Hierarchical Navigable Small World (HNSW) algorithm for efficient approximate nearest neighbor search in vector spaces." },
    { title: "Express.js Middleware", content: "Middleware functions in Express.js are functions that have access to the request object, the response object, and the next middleware function in the application’s request-response cycle." },
    // ... add at least 1000 more diverse items for a meaningful test
];

async function seedDatabase() {
    try {
        await setupSchema();
        const client = getWeaviateClient();
        let batcher = client.batch.objectsBatcher();
        let pending = 0;      // objects queued in the current batch
        let batchNumber = 0;

        for (const item of sampleData) {
            batcher = batcher.withObject({
                class: 'Article',
                properties: {
                    title: item.title,
                    content: item.content,
                },
            });

            pending++;
            // Flush every 100 objects, then start a fresh batcher so that
            // objects already sent are not queued again.
            if (pending === 100) {
                const res = await batcher.do();
                batchNumber++;
                logger.info(`Batch ${batchNumber} imported. Errors: ${res.filter(r => r.result?.errors).length}`);
                batcher = client.batch.objectsBatcher();
                pending = 0;
            }
        }

        // Flush any remaining objects
        if (pending > 0) {
            await batcher.do();
        }

        logger.info('Database seeding completed.');
    } catch (err) {
        logger.error({
            message: 'Seeding failed',
            error: err.message,
            stack: err.stack,
        });
        process.exit(1);
    }
}

seedDatabase();

Step 2: Building the Interactive Interface - Qwik's Instant Magic

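The backend API is ready, so it is the frontend's turn. If the UI adds its own delay because of framework bottlenecks such as hydration, the millisecond-level differences we measure on the backend become meaningless to the person at the controls. This is where Qwik shines. We need a text input, a slider for adjusting ef, and a results area.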

graph TD
    A[Qwik UI] -- 1. User types query and adjusts 'ef' slider --> B{Component State};
    B -- 2. State change triggers `useResource$` --> C[Express.js API /api/search];
    C -- 3. Request with `query` and `ef` params --> D[Weaviate Client];
    D -- 4. Constructs and sends GraphQL query with HNSW `ef` override --> E[Weaviate Engine];
    E -- 5. Performs search and returns results --> D;
    D -- 6. Client receives results --> C;
    C -- 7. Express middleware wraps query, measures duration, and sends JSON response {results, duration, ef} --> A;
    A -- 8. `useResource$` receives data, updates state, and Qwik instantly patches the DOM --> F[Visual Feedback];

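This is the complete flow of the system. Qwik's useResource$ is the key to it: it is designed for asynchronous data fetching and handles the pending, resolved, and rejected states cleanly.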

// file: qwik-app/src/routes/index.tsx

import { component$, useStore, $, useResource$, Resource } from '@builder.io/qwik';

interface SearchResultItem {
  title: string;
  content: string;
  _additional: {
    id: string;
    distance: number;
  };
}

interface ApiResponse {
  results: SearchResultItem[];
  queryDurationMs: number;
  ef: number | 'default';
}

interface SearchStore {
  query: string;
  ef: number;
  triggerSearch: number; // A simple counter to trigger refetch
  apiResponse: ApiResponse | null;
}

export default component$(() => {
  const store = useStore<SearchStore>({
    query: 'understanding modern web frameworks',
    ef: 64, // Initial search-time effort
    triggerSearch: 0,
    apiResponse: null,
  });

  const searchResource = useResource$<ApiResponse>(({ track, cleanup }) => {
    // This resource re-runs whenever `store.triggerSearch` changes.
    track(() => store.triggerSearch);

    // This is essential for debouncing or cancelling previous requests.
    const abortController = new AbortController();
    cleanup(() => abortController.abort());

    if (store.triggerSearch === 0 || !store.query) {
      return { results: [], queryDurationMs: 0, ef: 'default' };
    }

    const params = new URLSearchParams({
      query: store.query,
      ef: store.ef.toString(),
      limit: '15',
    });

    // In a real app, the API URL should be from an environment variable.
    const promise = fetch(`http://localhost:3001/api/search?${params.toString()}`, {
      signal: abortController.signal,
    }).then((res) => res.json() as Promise<ApiResponse>);
    
    return promise;
  });
  
  // We use QRL ($) to create a lightweight, serializable function.
  // This is the core of Qwik's performance magic.
  const handleSearch = $(() => {
    store.triggerSearch++;
  });

  return (
    <div class="container">
      <h1>Weaviate HNSW Performance Analyzer</h1>
      
      <div class="search-controls">
        <input
          type="text"
          value={store.query}
          onInput$={(e) => store.query = (e.target as HTMLInputElement).value}
          placeholder="Enter semantic query..."
        />
        <button onClick$={handleSearch}>Search</button>
      </div>

      <div class="slider-controls">
        <label for="ef-slider">Search-Time Effort (ef): {store.ef}</label>
        <input
          id="ef-slider"
          type="range"
          min="16"
          max="512"
          step="16"
          value={store.ef}
          onInput$={(e) => store.ef = parseInt((e.target as HTMLInputElement).value, 10)}
        />
        <p>Higher `ef` increases accuracy (recall) but also increases query latency.</p>
      </div>

      <div>
        {/* `useResource$` provides a convenient way to render based on promise state */}
        <Resource
          value={searchResource}
          onPending={() => <p>Loading...</p>}
          onRejected={(error) => <p>Error: {error.message}</p>}
          onResolved={(data) => (
            <>
              <div class="stats">
                <span>Query Duration: <strong>{data.queryDurationMs} ms</strong></span>
                <span>HNSW `ef` Used: <strong>{data.ef}</strong></span>
              </div>
              <ul class="results-list">
                {data.results.map((item) => (
                  <li key={item._additional.id}>
                    <h3>{item.title}</h3>
                    <p>{item.content.substring(0, 200)}...</p>
                    <small>Distance: {item._additional.distance.toFixed(4)}</small>
                  </li>
                ))}
              </ul>
            </>
          )}
        />
      </div>
    </div>
  );
});

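The best part of this Qwik code is its declarative data fetching and state management. When the user drags the slider, store.ef is updated. When the user clicks the "Search" button, we merely increment store.triggerSearch. useResource$ observes that change through the track function and automatically re-issues the API request. There is no manual DOM manipulation and no complicated lifecycle management. Qwik synchronizes state changes to the view efficiently, which is why the interaction feels so smooth. After dragging the slider and clicking search, the user sees the change in query duration (queryDurationMs) almost immediately, which gives us a tight performance feedback loop.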
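
One detail the component leaves open is what happens when the API answers with a non-2xx status: the JSON error body would be treated as resolved data and the results list would fail to render. A minimal sketch of a fetch helper that rejects in that case, so the onRejected branch is used, might look like this. fetchSearch and its fetchImpl parameter are hypothetical and exist only to keep the sketch testable without a running server.

```javascript
// Hypothetical helper: build the search URL, forward the abort signal,
// and turn non-2xx responses into rejections.
async function fetchSearch(baseUrl, { query, ef, limit }, { signal, fetchImpl = fetch } = {}) {
    const params = new URLSearchParams({
        query,
        ef: String(ef),
        limit: String(limit),
    });
    const res = await fetchImpl(`${baseUrl}/api/search?${params}`, { signal });
    if (!res.ok) {
        // Rejecting here lets the caller's error state handle the failure.
        throw new Error(`Search failed with HTTP ${res.status}`);
    }
    return res.json();
}
```

Inside useResource$, the existing fetch call could be replaced by a call to this helper with the same AbortController signal.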

Experiments and Observations

With the frontend and backend running, the real fun begins.

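  1. Low ef (e.g., 16-32):

    • Enter the query "the philosophy of scalable systems".
    • Observation: query duration is very low, perhaps 10-20 ms. The results may be relevant but are not necessarily the best matches, and a few articles may be slightly off-topic. This is the speed-first strategy.
  2. Medium ef (e.g., 64-128):

    • Keep the query unchanged and drag the slider to the middle of its range.
    • Observation: query duration rises, perhaps to 30-60 ms. Result quality improves noticeably, and the returned articles relate more closely to both "scalable systems" and "philosophy". This is the balance most production environments aim for.
  3. High ef (e.g., 256-512):

    • Drag the slider to its right-hand limit.
    • Observation: query duration increases sharply, possibly beyond 100 ms. The results are very precise, and almost all of them are the most relevant articles on the topic. For that last bit of precision, however, we pay several times the latency. In scenarios with very strict precision requirements, such as legal document retrieval, that trade is worth making.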

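The value of this tool is that it turns the abstract "trade-off between performance and precision" into concrete numbers that users can manipulate and observe first-hand.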
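
A single click gives one latency sample, which can be noisy. The same experiment can be scripted by repeating each ef setting a few times and comparing medians. The sketch below assumes a caller-supplied async function search(ef) that resolves to an object with a queryDurationMs field, as the API response above does; sweepEf and median are hypothetical names.

```javascript
// Median of a list of numbers (does not mutate the input).
function median(values) {
    const sorted = [...values].sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical harness: run `search` several times per ef value, sequentially,
// and summarize the reported durations.
async function sweepEf(search, efValues, runsPerValue = 5) {
    const rows = [];
    for (const ef of efValues) {
        const durations = [];
        for (let i = 0; i < runsPerValue; i++) {
            const { queryDurationMs } = await search(ef);
            durations.push(queryDurationMs);
        }
        rows.push({
            ef,
            medianMs: median(durations),
            minMs: Math.min(...durations),
            maxMs: Math.max(...durations),
        });
    }
    return rows;
}
```

Pointing search at the /api/search endpoint with a fixed query and passing something like [16, 64, 128, 256, 512] as efValues would reproduce the three scenarios above as a small table.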

Limitations of the Current Approach and Future Iterations

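The analyzer is already a useful tool, but it is not perfect. First, we explored only one search-time parameter, ef. A more complete analyzer should allow other parameters to be adjusted, such as the alpha weight in hybrid search.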

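Second, we measure only latency and do not quantify recall. Recall is a more involved metric that requires an evaluation dataset with ground-truth answers. A future iteration could add an evaluation mode: for a predefined set of questions, run searches at different ef values, compare the results against the ground truth to compute recall, and plot a full latency-recall curve.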
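
The recall computation itself is small once ground truth exists. The sketch below assumes that, for each question, the ground truth is a list of object IDs (for example the exact top-k from a brute-force search) and that returned IDs come from _additional.id. recallAtK and meanRecall are hypothetical names.

```javascript
// Fraction of the ground-truth top-k that appears in the returned top-k.
function recallAtK(groundTruthIds, returnedIds, k = groundTruthIds.length) {
    if (k <= 0) throw new RangeError('k must be positive');
    const truth = new Set(groundTruthIds.slice(0, k));
    if (truth.size === 0) throw new RangeError('ground truth must not be empty');

    const returned = new Set(returnedIds.slice(0, k));
    let hits = 0;
    for (const id of returned) {
        if (truth.has(id)) hits++;
    }
    return hits / truth.size;
}

// Average recall over a set of evaluation cases:
// [{ groundTruthIds: [...], returnedIds: [...] }, ...]
function meanRecall(cases, k) {
    const total = cases.reduce(
        (sum, c) => sum + recallAtK(c.groundTruthIds, c.returnedIds, k),
        0,
    );
    return total / cases.length;
}
```

Computing meanRecall once per ef value and pairing it with the measured latency gives the points of the latency-recall curve.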

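Finally, the backend is currently stateless. In a real high-concurrency system, to avoid repeated computation, a cache keyed on the query and the ef value (for example, backed by Redis) could be added at the Express layer. For popular queries, caching can greatly reduce the load on Weaviate, but it also complicates performance analysis: measurements need a way to bypass the cache selectively.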
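
As a rough illustration of that idea, the sketch below uses an in-memory Map as a stand-in for Redis and exposes a bypass flag for measurement runs. createSearchCache, getOrCompute, the TTL, and the key layout are all assumptions made for the example.

```javascript
// Hypothetical cache layer: keyed on (query, ef, limit), with a TTL and a
// bypass flag so that benchmark requests always hit Weaviate.
function createSearchCache(ttlMs = 30_000, now = () => Date.now()) {
    const entries = new Map(); // stand-in for Redis in this sketch
    const keyOf = (query, ef, limit) => JSON.stringify([query, ef, limit]);

    return {
        async getOrCompute({ query, ef, limit, bypass = false }, compute) {
            const key = keyOf(query, ef, limit);
            const hit = entries.get(key);
            if (!bypass && hit && hit.expiresAt > now()) {
                return { value: hit.value, cached: true };
            }
            // Miss, expired, or bypassed: run the real query and refresh the entry.
            const value = await compute();
            entries.set(key, { value, expiresAt: now() + ttlMs });
            return { value, cached: false };
        },
    };
}
```

Returning the cached flag alongside the value lets the API report whether a given duration came from Weaviate or from the cache, which keeps the analyzer's numbers interpretable.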

