Add example to discover non-English sites #44

Merged: handecelikkanat merged 2 commits into main from discovery-of-non-english-sites on Mar 20, 2026

Conversation

@sebastian-nagel commented Mar 19, 2026

Contributor

TODO:

  • test the query on Athena
  • pages with no language or an unknown ("unk") language are counted as LOTE (languages other than English), though that may not be a big problem.

@sebastian-nagel force-pushed the discovery-of-non-english-sites branch from 21179c9 to e631c07 on March 19, 2026 13:58
@sebastian-nagel marked this pull request as ready for review on March 19, 2026 14:00
- exclude pages with unknown language from LOTE count
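
A minimal Python sketch of the counting rule this commit describes, for illustration only. It is not the query from the pull request. It assumes each page record carries a host name and a comma-separated list of ISO-639-3 language codes with the primary language first; the function names and the sample hosts are hypothetical. Pages with a missing or unknown ("unk") language are skipped, so they count neither as English nor as LOTE.

```python
from collections import Counter

# Values treated as "no usable language" (assumption: empty string or "unk").
UNKNOWN = {"", "unk"}


def primary_language(content_languages):
    """Return the first language code, or None if it is missing or unknown."""
    if content_languages is None:
        return None
    first = content_languages.split(",")[0].strip().lower()
    return None if first in UNKNOWN else first


def lote_counts(pages):
    """Count LOTE pages per host, ignoring pages with unknown language.

    pages: iterable of (host, content_languages) tuples.
    Returns {host: (lote_pages, pages_with_known_language)}.
    """
    lote = Counter()
    known = Counter()
    for host, langs in pages:
        lang = primary_language(langs)
        if lang is None:
            continue  # unknown language: excluded from both counts
        known[host] += 1
        if lang != "eng":
            lote[host] += 1
    return {host: (lote[host], known[host]) for host in known}


if __name__ == "__main__":
    sample = [
        ("a.example", "deu"),
        ("a.example", "unk"),
        ("a.example", None),
        ("a.example", "eng,deu"),
        ("b.example", "fra,eng"),
    ]
    for host, (n_lote, n_known) in sorted(lote_counts(sample).items()):
        print(host, n_lote, n_known)
```

Returning the number of pages with a known language alongside the LOTE count lets a caller compute a LOTE share per host without the unknown pages skewing the ratio.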

@handecelikkanat left a comment

Contributor

LGTM. This is a very interesting edge case, and thank you for providing a solution for it as well, @sebastian-nagel!

@handecelikkanat merged commit d0e554c into main on Mar 20, 2026
7 checks passed
@handecelikkanat deleted the discovery-of-non-english-sites branch on March 20, 2026 12:34