Help Yaali design a phoneme inventory database.

Yaali Annar The Gote
posts: 94
, Initiate Speaker

Originally I was thinking of making a phoneme inventory generator.
In order to do that, I need to figure out how common each possible phoneme is.
In order to do that, I need to catalogue the phonemes of human languages.
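
A minimal sketch of that chain, assuming "how common" means the share of catalogued languages whose inventory contains the phoneme. The three toy inventories and the names `commonality` and `generate_inventory` are made up for illustration; none of it is real language data.

```python
import random
from collections import Counter

# Toy inventories, invented for illustration (not real language data).
INVENTORIES = {
    "lang_a": {"p", "t", "k", "m", "n", "a", "i", "u"},
    "lang_b": {"p", "t", "k", "s", "m", "a", "i"},
    "lang_c": {"t", "k", "q", "n", "a", "u"},
}

def commonality(inventories):
    """Return {phoneme: share of languages whose inventory contains it}."""
    counts = Counter(p for inv in inventories.values() for p in inv)
    total = len(inventories)
    return {p: n / total for p, n in counts.items()}

def generate_inventory(freqs, rng):
    """Include each phoneme independently, with probability equal to its share."""
    return {p for p, share in sorted(freqs.items()) if rng.random() < share}

freqs = commonality(INVENTORIES)
sample = generate_inventory(freqs, random.Random(0))
```

Drawing each phoneme independently is the simplest possible choice. It ignores the fact that phonemes tend to come in series, so a real generator would probably want something smarter.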

I know this has been done before in UPSID, but I'm adding several new fields to my data. These are the columns I have figured out so far for my table:

- Language, self-explanatory. This column will be filled with the ISO 639-3 code of the language.
- Phoneme, self-explanatory. This column will be filled with IPA in Unicode. I think four characters are enough... or are they?
- Nativity, whether the phoneme is native or not. For example, /f/ and /z/ are observed in Indonesian, but they are not native.
- Underlying, the underlying representation of the phoneme. For example, many South Asian languages are listed with /n̪ t̪ d̪/ and no alveolars to contrast with them. The underlying representation of these phonemes would be /n t d/.
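
The four columns could be sketched as an SQLite table like this. The table name, column names, and the 0/1 encoding of nativity are assumptions, and `und` (the ISO 639-3 code for "undetermined") is only a placeholder for an unspecified South Asian language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE phoneme (
        language   TEXT NOT NULL,   -- ISO 639-3 code
        phoneme    TEXT NOT NULL,   -- IPA in Unicode, no length limit imposed
        native     INTEGER NOT NULL DEFAULT 1 CHECK (native IN (0, 1)),
        underlying TEXT NOT NULL,   -- plain symbol when the diacritic is not contrastive
        PRIMARY KEY (language, phoneme)
    )
""")

rows = [
    ("ind", "f", 0, "f"),        # Indonesian /f/: observed but not native
    ("ind", "z", 0, "z"),        # Indonesian /z/: observed but not native
    ("und", "t\u032a", 1, "t"),  # dental t with no alveolar contrast (placeholder language)
]
conn.executemany("INSERT INTO phoneme VALUES (?, ?, ?, ?)", rows)

# Non-native phonemes recorded for Indonesian.
loans = [r[0] for r in conn.execute(
    "SELECT phoneme FROM phoneme WHERE language = 'ind' AND native = 0 ORDER BY phoneme"
)]
```

The composite primary key keeps the same phoneme from being entered twice for one language.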

Are there any other columns I should consider when designing the table?
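
On the four-character question, one quick way to check is to count Unicode code points per segment, since every combining diacritic and modifier letter is its own code point. This is a sketch: the helper name `codepoints` is made up, and the second string is a hypothetical heavily marked segment, not a claim that any catalogued language needs it.

```python
import unicodedata

def codepoints(segment):
    """Count Unicode code points after canonical decomposition (NFD)."""
    return len(unicodedata.normalize("NFD", segment))

dental_n = "n\u032a"                   # n + combining bridge below
stacked = "t\u0361\u0283\u02b7\u02bc"  # t + tie bar + esh + superscript w + ejective mark

lengths = {"dental_n": codepoints(dental_n), "stacked": codepoints(stacked)}
```

A plain dental already uses two code points, and an affricate with a tie bar plus two secondary marks uses five, so a fixed width of four looks risky. A variable-length text column avoids the problem.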

edited 2 times, last update 9 years ago link

Yaali Annar The Gote
posts: 94
, Initiate Speaker

Another problem I'm going to run into is deciding which series count as phonemes in a language. This is usually a problem in the vowel department.

(If anyone can help me with Mandarin and English vowels, that would be awesome)

9 years ago link

Rhetorica Your Writing System Sucks
posts: 1292
, Kelatetía: Dis, Major Belt 1

How about actual frequency in a dictionary or sample text? That seems like a central concern.

edited once, last update 9 years ago link

KathTheDragon Beware the Dragon
posts: 92
, Baroness, United Kingdom

Something to consider with English is what dialect you're considering, especially with vowels.

9 years ago link

dhok posts: 235
, Alkali Metal, Norman, United States

Use upper-crust New England; it's the only correct choice:

/i ɪ ʊ u eɪ ɛ oʊ ɔ ɑ æ a ʌ ɜ/

9 years ago link

Nessari
posts: 932
, Illúbequía, Seattle, Cascadia
message

/slap

9 years ago link

Yaali Annar The Gote
posts: 94
, Initiate Speaker

quoting Rhetorica, Illúbequía: Dis, Major Belt 1:
How about actual frequency in a dictionary or sample text? That seems like a central concern.

It's a good idea, I'll add a column for that.

But for now, that column will have a default value of 0. It takes major effort to get the required corpus and then analyse it.
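
Once a transcribed sample exists, filling that column could look roughly like this. It is a sketch that assumes the text is already in phonemic IPA and that a greedy longest-match split against the known inventory is good enough. The function names, the inventory, and the sample string are all invented.

```python
from collections import Counter

def segment(transcription, inventory):
    """Greedy longest-match split of an IPA string into known phonemes."""
    by_length = sorted(inventory, key=len, reverse=True)
    out, i = [], 0
    while i < len(transcription):
        for ph in by_length:
            if transcription.startswith(ph, i):
                out.append(ph)
                i += len(ph)
                break
        else:
            i += 1  # skip spaces, stress marks, anything not in the inventory
    return out

def text_frequency(transcription, inventory):
    """Relative token frequency; phonemes never seen keep the default 0."""
    counts = Counter(segment(transcription, inventory))
    total = sum(counts.values())
    return {ph: (counts[ph] / total if total else 0) for ph in inventory}

# Invented inventory and transcription, for illustration only.
inventory = {"t", "tʃ", "a", "i", "k"}
tokens = segment("tʃa tita", inventory)
freq = text_frequency("tʃa tita", inventory)
```

Trying the longest symbols first keeps /tʃ/ from being counted as /t/ followed by an unknown character.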

quoting KathTheDragon, Baroness, United Kingdom:
Something to consider with English is what dialect you're considering, especially with vowels.

Maybe Middle English then? Shit's easy. It only has /i iː u uː e eː o oː ɛː ɔː a aː/, just like a normal, sane language.

Here's an abomination I've come up with so far:

PALM - ɑ
LOT - ɒ
CLOTH - ɒ
BATH - ɑː
TRAP - æ
PRICE - ai
START - ar
MOUTH - au
THOUGHT - o
DRESS - e
COMMA - ə
FACE - ei
SQUARE - er
LETTER - ər
KIT - i
HAPPY - i
FLEECE - iː
NEAR - ir
CHOICE - oi
NORTH - or
GOAT - ou
FORCE - our
FOOT - u
GOOSE - uː
CURE - ur
STRUT - ʌ
NURSE - ʌr
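
A quick way to see what this table implies is to invert it and look at which lexical sets share a symbol. The sketch below copies the list above, writing length with ː throughout. The variable names are made up, and whether the vowel + r sequences count as phonemes of their own is left open.

```python
from collections import defaultdict

# Lexical set -> symbol, copied from the list above.
LEXICAL_SETS = {
    "PALM": "ɑ", "LOT": "ɒ", "CLOTH": "ɒ", "BATH": "ɑː", "TRAP": "æ",
    "PRICE": "ai", "START": "ar", "MOUTH": "au", "THOUGHT": "o",
    "DRESS": "e", "COMMA": "ə", "FACE": "ei", "SQUARE": "er",
    "LETTER": "ər", "KIT": "i", "HAPPY": "i", "FLEECE": "iː",
    "NEAR": "ir", "CHOICE": "oi", "NORTH": "or", "GOAT": "ou",
    "FORCE": "our", "FOOT": "u", "GOOSE": "uː", "CURE": "ur",
    "STRUT": "ʌ", "NURSE": "ʌr",
}

# Group the lexical sets by the symbol assigned to them.
by_vowel = defaultdict(list)
for word, vowel in LEXICAL_SETS.items():
    by_vowel[vowel].append(word)

# Symbols shared by more than one lexical set, i.e. mergers in this analysis.
merged = {v: words for v, words in by_vowel.items() if len(words) > 1}
distinct = len(by_vowel)
```

Here only LOT/CLOTH and KIT/HAPPY collapse, so the 27 lexical sets map onto 25 distinct symbols or sequences.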

edited once, last update 9 years ago link

Nessari
posts: 932
, Illúbequía, Seattle, Cascadia

Which Middle English? The vowels in the Anglo-Saxon portions of Great Britain have been splintered ever since the Anglo-Saxons invaded. Any clean, simple sets you think you have are illusions.

9 years ago link

Uzhdarchios posts: 19
, Foreigner in Unknown Kadath

PHOIBLE might be of some use here, and it conveniently incorporates all the UPSID and SPA data.

9 years ago link
