Accented characters in the search module

I am building a comprehensive website of all (or most) of my stage photography from the last 37 years. The site has a search function. All the keywords are names of people involved in the various productions. The site is in French, though, so many names contain accented characters, like “Bédard”, “Danièle Lévesque” or “Michel Crête”.
The names are written in the page content of every production album, in HTML and in all caps.
However, the same names are written in normal mixed case in the keywords section of Lightroom, then published to Backlight.
For some strange reason, names containing accented characters are not returned by the search module unless they are typed in caps. All other names behave normally and return correct results.
Why won’t the names with accented characters show up?

I tried deleting the keywords in Lightroom for this album: https://pideja.ca/galleries/02_tnm/1989-1990/haha/
Then, using the search module, I searched for several of the names in the credits listed on the album’s page. All search results were correct! This would seem to indicate that the results come directly from the album’s Page Content / Main Copy section rather than from any keyword list in Lightroom. The names in the credits are written in caps, but they are returned even if the query in the search box is typed in normal case. The exception is names with accented characters: these MUST be typed in caps in the search box to return a correct result.

From the documentation:

Backlight’s search operates on keywords, file names, album titles and image captions. It matches albums containing the search terms, as well as individual images.

It does not look at page copy

Then how can it be that an album without any keywords, but with the credit names filled in under Page Content / Main Copy, returns correct results in the search module?

In this example, only the É is in caps:
https://pideja.ca/backlight/search/?q=%22Andr%C3%89+Naud%22
This returns 4 productions, all of which just have the Title, Theater and date in the Lightroom keywords. All other info, like the artist’s name (in this case, André Naud) is taken from the Page Content / Main Copy section of the album https://pideja.ca/backlight/publisher/edit_album/863856/list/#page-content
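
For reference, the query string in that URL can be decoded to see exactly what the search receives. The following is a small Python sketch using only the standard library; it is not Backlight code, just a way to confirm that `%C3%89` is the UTF-8 encoding of the uppercase É:

```python
from urllib.parse import urlsplit, parse_qs

url = "https://pideja.ca/backlight/search/?q=%22Andr%C3%89+Naud%22"

# parse_qs decodes percent-escapes as UTF-8 and turns '+' into a space.
query = parse_qs(urlsplit(url).query)["q"][0]
print(query)  # the decoded term, double quotes included

# The bytes C3 89 are the UTF-8 encoding of uppercase É (U+00C9);
# lowercase é (U+00E9) is encoded as C3 A9 instead.
upper_bytes = "É".encode("utf-8")
lower_bytes = "é".encode("utf-8")
print(upper_bytes.hex(), lower_bytes.hex())
```

So the search in this example receives the name with a capital É and otherwise normal case.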

All I know is what the documents show and what Ben has confirmed in his replies.

I have no reason to doubt you and Ben, but at least on this album titled “Ha ha!..” (https://pideja.ca/galleries/02_tnm/1989-1990/haha/), I deleted all keywords from the list in Lightroom, re-published, cleared the caches, and then searched for “Jean-Marie Guay”. The show “Ha ha!..” turns up among the results!
I then searched for “DANIÈLE LÉVESQUE” and it turns up as well. Notice that I used all caps because of the accented characters.
So I don’t know what is happening here, but clearly the search engine is finding data in the Main Copy of the page and not in the keywords.

Did you republish the album and clear browser cache prior to trying the search?

I have verified that ALL keywords have been deleted in Lightroom;
I then re-published from Lightroom to site;
I then refreshed the template and the browser’s caches;
I then re-checked the search, using “Jean-Marie Guay” as an example.

The search correctly reported that “Jean-Marie Guay” is on the “Ha ha!..” production. This tells me that the search module uses the page copy to answer the query.
As for the accented characters, I have put an instruction on the search page asking users to type proper names in all caps. This way, users will be able to find artists whose names contain accents, and I will thus continue to use all caps for proper names.

So, for now, at least, I consider the issue resolved.

Thank you, Matthew, Ben and Rod for all your gracious help.


Hi Pierre, is this the documentation you’ve been looking at? https://blog.theturninggate.net/2018/02/19/mastering-backlights-search/

We may need to update details of what is searched. The proof is in the code.

  1. Albums are searched by: title, description and page copy.
  2. Photos are searched by: title, caption, filename and keywords
  3. Albums are also returned if they include photos that matched in 2)
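
The three rules above can be sketched in Python as follows. This is only an illustration of the search scope, not Backlight's actual code: the `Album` and `Photo` classes, the `search` function, and the sample data (including the file name `example_001.jpg`) are hypothetical, and matching is assumed to be a plain case-insensitive substring test.

```python
from dataclasses import dataclass, field

@dataclass
class Photo:  # hypothetical structure, for illustration only
    title: str = ""
    caption: str = ""
    filename: str = ""
    keywords: list = field(default_factory=list)

@dataclass
class Album:  # hypothetical structure, for illustration only
    title: str = ""
    description: str = ""
    page_copy: str = ""
    photos: list = field(default_factory=list)

def contains(term, *texts):
    """Assumed matching rule: case-insensitive substring test."""
    needle = term.casefold()
    return any(needle in text.casefold() for text in texts)

def search(albums, term):
    matched_albums, matched_photos = [], []
    for album in albums:
        # Rule 2: photos match on title, caption, filename and keywords.
        hits = [p for p in album.photos
                if contains(term, p.title, p.caption, p.filename, *p.keywords)]
        matched_photos.extend(hits)
        # Rule 1: albums match on title, description and page copy.
        # Rule 3: albums are also returned if any of their photos matched.
        if contains(term, album.title, album.description, album.page_copy) or hits:
            matched_albums.append(album)
    return matched_albums, matched_photos

# An album with no keywords at all, only names in its page copy.
haha = Album(title="Ha ha!..", page_copy="JEAN-MARIE GUAY, DANIÈLE LÉVESQUE")
# A second, made-up album that matches only through a photo keyword.
other = Album(title="Hypothetical album",
              photos=[Photo(filename="example_001.jpg",
                            keywords=["Jean-Marie Guay"])])

albums, photos = search([haha, other], "Jean-Marie Guay")
print([a.title for a in albums])
print([p.filename for p in photos])
```

Under these assumptions the first album is found through its page copy alone, which is consistent with an album without keywords still showing up in search results.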

I haven’t yet had time to look into how accents are handled in search, and what we can do to improve it.

This is good to know about page copy being searchable.
How does it rank compared to image keywords, caption and filename?

My site is structured this way: Album Sets are production houses, Albums within these are seasons, and photos have a serial number. At the moment, some of these photos have keywords (season, title, names of creators such as the set designer, lighting designer, director…, and finally production house). Lightroom sorts these in alphabetical order.
But most photos have no keywords at all!
I put all the information needed in the Page Content / Main Copy section of each album. See this example:
https://pideja.ca/backlight/publisher/edit_album/637033/list/#page-content

Since this is sort of an illustrated catalogue of different set designs I documented for over thirty years, a typical use for it could be finding a play or a director or any other creative artist in relation to the stage plays illustrated within the site.
An example: Michel Beaulieu was a lighting director. A search for his name would return all productions associated with his name…
https://pideja.ca/backlight/search/?q=%22MICHEL+BEAULIEU%22
Selecting a title will display that show’s page with all the credits and photos.
If someone knows which photo they are looking for, that search will also yield a correct result: https://pideja.ca/backlight/search/?q=%22TNM900301_+LC%22
That’s the main goal of this site.

I’ve already explained the problem with accented characters. The funny thing is that credits without accented characters return correct results whether the query is typed in normal case or in caps, even though the credits are written in caps in the Main Copy. Names with accented characters, however, must be typed in caps to be found.
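
One possible explanation for this behaviour, offered only as a guess since nobody in this thread has confirmed how Backlight compares text, is that the case-insensitive comparison only folds the ASCII letters A to Z and leaves accented letters untouched. The Python sketch below reproduces that symptom and contrasts it with a Unicode-aware comparison. All function names and the sample credits string are hypothetical.

```python
import unicodedata

def ascii_lower(text):
    """Lowercase only A-Z; every other character is left as-is."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)

def matches_ascii_only(query, stored):
    """Assumed faulty behaviour: ASCII-only case folding."""
    return ascii_lower(query) in ascii_lower(stored)

def fold(text):
    """Unicode-aware folding: casefold, then strip accents."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def matches_unicode(query, stored):
    """Possible improvement: compare the folded forms."""
    return fold(query) in fold(stored)

# Hypothetical page copy with credits in all caps.
credits = "MICHEL BEAULIEU, DANIÈLE LÉVESQUE"

print(matches_ascii_only("michel beaulieu", credits))   # True
print(matches_ascii_only("Danièle Lévesque", credits))  # False
print(matches_ascii_only("DANIÈLE LÉVESQUE", credits))  # True
print(matches_unicode("Danièle Lévesque", credits))     # True
print(matches_unicode("daniele levesque", credits))     # True
```

With ASCII-only folding, `È` and `è` never compare equal, so a mixed-case query fails against all-caps credits while the all-caps query succeeds. The Unicode-aware version also lets users type names without accents, which may or may not be desirable for a French-language site.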

I built this intuitively since my experience with databases and search engines is limited at best. So, I hope I am clear in my explanations.