Search not working for multiple words #127

Closed
opened 2024-05-15 12:32:22 +02:00 by Dalai Felinto · 2 comments

I think it is searching for each word individually now?

Example: https://extensions.blender.org/search/?q=Blender+ID

I expected to find: https://extensions.blender.org/add-ons/blender-id-authentication/

Pablo Vazquez added the Type Report label 2024-05-15 16:47:28 +02:00
Owner

Our current code looks for a substring match, which also matches parts of words and makes the results noisy for short tokens like `id`.
The extension you expected to find is listed among the other results, but with this much noise the search is not really useful.

I couldn't find a simple way to implement a whole-word match that would work the same way in both PostgreSQL and SQLite.
In any case, the current implementation of search is too naive and inefficient to be called production-ready.
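
To illustrate the difference between the two matching strategies mentioned above, here is a minimal sketch in plain Python. It is not the site's actual query code: the extension names and both helper functions are made up for illustration, and a database-side whole-word match would need backend-specific SQL.

```python
import re

# Hypothetical extension names, for illustration only.
NAMES = [
    "Blender ID Authentication",
    "Grid Modeler",
    "Video Editing Tools",
]


def substring_match(token, names):
    """Case-insensitive substring match, similar in spirit to SQL LIKE '%token%'."""
    needle = token.lower()
    return [name for name in names if needle in name.lower()]


def whole_word_match(token, names):
    """Match the token only where it stands as a separate word."""
    pattern = re.compile(rf"\b{re.escape(token)}\b", re.IGNORECASE)
    return [name for name in names if pattern.search(name)]


# "id" is a substring of "Grid" and "Video", so every name matches.
print(substring_match("id", NAMES))
# Only the name containing "ID" as its own word matches.
print(whole_word_match("id", NAMES))
```

With a short token like `id`, the substring version returns all three names, while the whole-word version returns only "Blender ID Authentication".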

We should look into testing PostgreSQL full-text search on our use cases; maybe that would be sufficient: https://docs.djangoproject.com/en/4.2/ref/contrib/postgres/search/

Pablo Vazquez added the Reviewed Confirmed label 2024-05-27 12:56:46 +02:00
Oleg-Komarov self-assigned this 2024-05-31 15:53:19 +02:00
Owner

I've deployed #162. This should improve things a little, but it should be considered a stopgap solution.

If we need to improve search, we should start collecting requirements and use cases in a form similar to this report: query + expected result set.

From the access logs we can look up what people are searching for. In the last month we've been getting approximately 150-200 requests per day to the /search endpoint, so it is possible to identify popular search terms by hand or with a simple query, and then check whether we actually find the things that should be found.
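
As a rough sketch of what such a "simple query" could look like, the snippet below counts search terms from request paths. It assumes the paths have already been extracted from the access logs (the log format itself is not described here), and the function name `popular_search_terms` is hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def popular_search_terms(request_paths, top=10):
    """Count the `q` parameter of /search requests, ignoring case."""
    counts = Counter()
    for path in request_paths:
        parts = urlsplit(path)
        # Only consider requests to the /search endpoint.
        if parts.path.rstrip("/") != "/search":
            continue
        for query in parse_qs(parts.query).get("q", []):
            term = query.strip().lower()
            if term:
                counts[term] += 1
    return counts.most_common(top)


# Made-up sample paths, for illustration only.
sample = [
    "/search/?q=Blender+ID",
    "/search/?q=blender+id",
    "/add-ons/",
    "/search/?q=rigging",
]
print(popular_search_terms(sample))
```

Each popular term can then be paired with the result set people would expect, in the same "query + expected result set" form as this report.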

Closing this issue for now.

Reference: infrastructure/extensions-website#127