Matching Behavior
This page describes what happens to a license string before and during lookup.
Quote Normalization
Before lookup, LicenseLynx normalizes recognized Unicode quote characters to the ASCII single quote '.
This helps when license strings are copied from PDFs, websites, office documents, or other sources that use typographic quotes.
Examples:
- BSD ‚Zero‛ Clause becomes BSD 'Zero' Clause
- Licensed under the “MIT” license becomes Licensed under the 'MIT' license
Lookup Order
Lookup happens in this order:
- Normalize quote characters.
- Search the stable map.
- If risky is enabled, search the risky map.
- If org is provided, search the organization-specific map.
See Risky Mappings for the meaning of the risky map.
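The order above can be sketched in Python as follows. This is only an illustration, not the library's implementation: the names lookup, normalize_quotes, STABLE_MAP, RISKY_MAP, and ORG_MAPS, the map contents, and the abbreviated quote table are all hypothetical. It also assumes that the first match wins, which this page does not state explicitly.

```python
from typing import Optional

# Hypothetical in-memory maps, made up for this sketch.
STABLE_MAP = {"MIT License": "MIT", "BSD 'Zero' Clause": "0BSD"}
RISKY_MAP = {"MIT-style": "MIT"}
ORG_MAPS = {"example-org": {"Internal MIT": "MIT"}}

# Abbreviated table: only a few of the recognized quote characters.
_QUOTES = str.maketrans(
    dict.fromkeys("\u2018\u2019\u201A\u201B\u201C\u201D", "'")
)


def normalize_quotes(text: str) -> str:
    """Replace the typographic quotes in the table with an ASCII single quote."""
    return text.translate(_QUOTES)


def lookup(name: str, risky: bool = False, org: Optional[str] = None) -> Optional[str]:
    """Follow the documented order; assumes the first hit is returned."""
    key = normalize_quotes(name)           # 1. normalize quote characters
    if key in STABLE_MAP:                  # 2. stable map
        return STABLE_MAP[key]
    if risky and key in RISKY_MAP:         # 3. risky map, only when enabled
        return RISKY_MAP[key]
    if org is not None:                    # 4. organization map, only when given
        return ORG_MAPS.get(org, {}).get(key)
    return None
```

In this sketch, a name that exists only in the risky map is not found unless risky is enabled, and a name that exists only in an organization map is not found unless that org is passed.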
Recognized Quote Characters
All three libraries normalize the same set of characters:
quote_characters = [
    # Single quotes
    "\u2018",  # LEFT SINGLE QUOTATION MARK ‘
    "\u2019",  # RIGHT SINGLE QUOTATION MARK ’
    "\u201A",  # SINGLE LOW-9 QUOTATION MARK ‚
    "\u201B",  # SINGLE HIGH-REVERSED-9 QUOTATION MARK ‛
    "\u2032",  # PRIME (often used as an apostrophe) ′
    "\uFF07",  # FULLWIDTH APOSTROPHE ＇
    # Double quotes
    "\u201C",  # LEFT DOUBLE QUOTATION MARK “
    "\u201D",  # RIGHT DOUBLE QUOTATION MARK ”
    "\u201E",  # DOUBLE LOW-9 QUOTATION MARK „
    "\u201F",  # DOUBLE HIGH-REVERSED-9 QUOTATION MARK ‟
    "\u2033",  # DOUBLE PRIME ″
    "\u00AB",  # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK «
    "\u00BB",  # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK »
    "\uFF02",  # FULLWIDTH QUOTATION MARK ＂
]
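One way to apply this list is a translation table that maps every listed character to the ASCII single quote. The sketch below is a minimal illustration under that assumption. The names QUOTE_CHARACTERS and normalize_quotes are hypothetical, and the libraries may implement the step differently.

```python
# The same code points as the list above, written as escapes.
QUOTE_CHARACTERS = [
    "\u2018", "\u2019", "\u201A", "\u201B", "\u2032", "\uFF07",  # single
    "\u201C", "\u201D", "\u201E", "\u201F", "\u2033",            # double
    "\u00AB", "\u00BB", "\uFF02",                                # double
]

# str.maketrans accepts a dict whose keys are single characters.
_QUOTE_TABLE = str.maketrans({ch: "'" for ch in QUOTE_CHARACTERS})


def normalize_quotes(text: str) -> str:
    """Map every listed quote character to the ASCII single quote."""
    return text.translate(_QUOTE_TABLE)


print(normalize_quotes("BSD \u201AZero\u201B Clause"))
print(normalize_quotes("Licensed under the \u201CMIT\u201D license"))
```

Because the ASCII double quote is not in the list, this sketch leaves it unchanged.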