[virt-tools-list] [libosinfo 7/8] rfc: Infer ISO language from label
Zeeshan Ali (Khattak)
zeeshanak at gnome.org
Mon Dec 3 16:30:18 UTC 2012
On Mon, Dec 3, 2012 at 1:23 PM, Christophe Fergeau <cfergeau at redhat.com> wrote:
> Now that libosinfo has an osinfo_db_identify_media method which
> modifies the media it was passed, we can generate properties which
> needs information from the media stored in the OsinfoDB, and
> information from the actual media (ISO volume ID).
> This is useful to guess what languages are supported by a given
> Windows ISO: the end of the ISO volume ID has a language code, which
> we can translate to a locale identifier.
>
> This commit adds a lang-regex property to the OsinfoDB database to
> extract the language code from Windows ISO volume IDs, and
> then add mapping tables to turn it into a locale identifier.
> ---
> data/oses/windows.xml.in | 2 +
> data/schemas/libosinfo.rng | 5 ++
> osinfo/libosinfo.syms | 4 +-
> osinfo/osinfo_db.c | 177 +++++++++++++++++++++++++++++++++++++++++++++
> osinfo/osinfo_loader.c | 4 +-
> osinfo/osinfo_media.c | 67 ++++++++++++++++-
> osinfo/osinfo_media.h | 3 +
> 7 files changed, 258 insertions(+), 4 deletions(-)
>
> diff --git a/data/oses/windows.xml.in b/data/oses/windows.xml.in
> index d09e873..e8c29f9 100644
> --- a/data/oses/windows.xml.in
> +++ b/data/oses/windows.xml.in
> @@ -739,12 +739,14 @@
> <iso>
> <volume-id>(HB1_CCPA_X86FRE|HRM_CCSA_X86FRE|HRM_CCSA_X86CHK|HRM_CCSNA_X86CHK|HRM_CCSNA_X86FRE|HRM_CENA_X86FREV|HRM_CENA_X86CHKV|HRM_CENNA_X86FREV|HRM_CENNA_X86CHKV|HRM_CPRA_X86FREV|HRM_CPRNA_X86FREV)_</volume-id>
> <publisher-id>MICROSOFT CORPORATION</publisher-id>
> + <lang-regex>[[:upper:][:digit:]_]*_([[:upper:]]*-[[:upper:]]*)</lang-regex>
> </iso>
> </media>
> <media arch="x86_64">
> <iso>
> <volume-id>(HB1_CCPA_X64FRE|HRM_CCSA_X64FRE|HRM_CCSA_X64CHK|HRM_CCSNA_X64FRE|HRM_CCSNA_X64CHK|HRM_CENNA_X64FREV|HRM_CENNA_X64CHKV|HRM_CENA_X64FREV|HRM_CENA_X64CHKV|HRM_CPRA_X64FREV|HRM_CPRNA_X64FREV)_</volume-id>
> <publisher-id>MICROSOFT CORPORATION</publisher-id>
> + <lang-regex>[[:upper:][:digit:]_]*_([[:upper:]]*-[[:upper:]]*)</lang-regex>
> </iso>
> </media>
>
> diff --git a/data/schemas/libosinfo.rng b/data/schemas/libosinfo.rng
> index 87635dd..36fa1a1 100644
> --- a/data/schemas/libosinfo.rng
> +++ b/data/schemas/libosinfo.rng
> @@ -281,6 +281,11 @@
> <text/>
> </element>
> </optional>
> + <optional>
> + <element name='lang-regex'>
> + <text/>
> + </element>
> + </optional>
> </interleave>
> </element>
> </define>
> diff --git a/osinfo/libosinfo.syms b/osinfo/libosinfo.syms
> index d45e58e..7c3efe1 100644
> --- a/osinfo/libosinfo.syms
> +++ b/osinfo/libosinfo.syms
> @@ -341,11 +341,11 @@ LIBOSINFO_0.2.2 {
> osinfo_install_config_set_target_disk;
> osinfo_install_config_get_script_disk;
> osinfo_install_config_set_script_disk;
> -
> osinfo_install_script_get_avatar_format;
> osinfo_install_script_get_path_format;
> -
> osinfo_install_script_get_product_key_format;
> +
> + osinfo_media_get_languages;
> } LIBOSINFO_0.2.1;
>
> /* Symbols in next release...
> diff --git a/osinfo/osinfo_db.c b/osinfo/osinfo_db.c
> index 46101d6..2c2eb5a 100644
> --- a/osinfo/osinfo_db.c
> +++ b/osinfo/osinfo_db.c
> @@ -38,6 +38,177 @@ G_DEFINE_TYPE (OsinfoDb, osinfo_db, G_TYPE_OBJECT);
> (((str) != NULL) && \
> g_regex_match_simple((pattern), (str), 0, 0)))
>
> +static gchar *get_raw_lang(const char *volume_id, const gchar *regex_str)
> +{
> + GRegex *regex;
> + GMatchInfo *match;
> + gboolean matched;
> + gchar *raw_lang = NULL;
> +
> + regex = g_regex_new(regex_str, G_REGEX_ANCHORED,
> + G_REGEX_MATCH_ANCHORED, NULL);
> + if (regex == NULL)
> + return NULL;
> +
> + matched = g_regex_match(regex, volume_id, G_REGEX_MATCH_ANCHORED, &match);
> + if (!matched || !g_match_info_matches(match))
> + goto end;
> + raw_lang = g_match_info_fetch(match, 1);
> + if (raw_lang == NULL)
> + goto end;
> +
> +end:
> + g_match_info_unref(match);
> + g_regex_unref(regex);
> +
> + return raw_lang;
> +}
> +
> +struct LanguageMapping {
> + const char *iso_label_lang;
> + const char *gettext_lang;
> +};
> +
> +static GHashTable *init_win_lang_map(void)
> +{
> + GHashTable *lang_map;
> + const struct LanguageMapping lang_table[] = {
> + /* ISO label strings up to Windows 7 */
> + { "EN", "en_US" },
I agree with you that it would be nice to avoid this mapping all
together but if its not possible/feasible, these mappings should be in
the XML like rest of OS specific stuff. Then again, if we go that
route, I wonder if all this approach is really better than having
separate media entries for each combination of supported languages.
--
Regards,
Zeeshan Ali (Khattak)
FSF member#5124
More information about the virt-tools-list
mailing list