Help Yaali design a phoneme inventory database.

Yaali Annar The Gote (posts: 94, Initiate Speaker)
Originally I was thinking of making a phoneme inventory generator.
In order to do that, I need to figure out how common each possible phoneme is.
In order to do that, I need to catalogue the phonemes of human languages.
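
A minimal sketch of that chain (catalogue, then commonality, then generator), assuming "commonality" means the fraction of catalogued languages that have a phoneme. The inventories and the `xx1`-style codes below are invented placeholders, not real language data, and including each phoneme independently is only one possible generation strategy.

```python
import random
from collections import Counter

# Toy catalogue: placeholder codes and made-up inventories, for illustration only.
INVENTORIES = {
    "xx1": {"p", "t", "k", "m", "n", "a", "i", "u"},
    "xx2": {"p", "t", "k", "s", "m", "a", "i"},
    "xx3": {"t", "k", "s", "n", "a", "u"},
    "xx4": {"p", "t", "m", "n", "a", "i", "u"},
}

def commonality(inventories):
    """Fraction of catalogued languages whose inventory contains each phoneme."""
    counts = Counter(p for inv in inventories.values() for p in inv)
    total = len(inventories)
    return {p: n / total for p, n in counts.items()}

def generate_inventory(weights, rng):
    """Include each phoneme independently, with probability equal to its commonality."""
    return sorted(p for p, w in sorted(weights.items()) if rng.random() < w)

weights = commonality(INVENTORIES)
sample = generate_inventory(weights, random.Random(0))
```

Independent draws ignore the fact that phonemes tend to pattern together (a voiced series usually implies its voiceless counterpart, for instance), so a real generator would probably need conditional weights as well.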

I know this has been done before in UPSID, but I'm adding several new fields to my data. The following are the columns I have figured out so far for my table:

- Language, self-explanatory. This column will be filled with the ISO 639-3 code of the language.
- Phoneme, self-explanatory. IPA Unicode will be used to fill this column. I think four characters are enough... or are they?
- Nativity, whether the phoneme is native or not. For example, /f/ and /z/ are observed in Indonesian, but they are not native.
- Underlying, the underlying representation of the phoneme. For example, many South Asian languages are listed with /n̪ t̪ d̪/ and no alveolars to contrast with them. The underlying representation of these phonemes would be /n t d/.
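
On the four-character question, a quick way to check is to count Unicode code points, since every diacritic and modifier letter takes one of its own. This is a sketch using Python's standard `unicodedata` module; the sample segments and their labels are picked here for illustration and are not tied to any particular language.

```python
import unicodedata

def codepoint_length(ipa, form="NFD"):
    """Number of Unicode code points in an IPA string after normalisation."""
    return len(unicodedata.normalize(form, ipa))

# Escapes are used so the exact code points are visible.
SAMPLES = {
    "plain t": "t",
    "dental t": "t\u032a",                  # t + combining bridge below
    "aspirated dental t": "t\u032a\u02b0",  # + modifier letter small h
    # t + tie bar + esh + modifier small w + modifier apostrophe
    "labialised ejective affricate": "t\u0361\u0283\u02b7\u02bc",
}

lengths = {name: codepoint_length(s) for name, s in SAMPLES.items()}
too_long = [name for name, n in lengths.items() if n > 4]

# The same vowel can be one or two code points depending on normalisation.
nasal_a_nfc = codepoint_length("\u00e3", "NFC")  # precomposed a with tilde
nasal_a_nfd = codepoint_length("\u00e3", "NFD")  # a + combining tilde
```

An affricate with a tie bar and two modifiers already needs five code points, so a four-character column looks too tight. Picking one normalisation form for the whole table also keeps the same phoneme from being stored under two different spellings.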

Are there any other columns I should consider when designing the table?
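
The four columns above could be sketched as a table like this, assuming SQLite through Python's standard `sqlite3` module. The table name `inventory`, the column names, and the composite primary key are assumptions made for the sketch. `ind` is the ISO 639-3 code for Indonesian; `qaa` comes from the range ISO 639-3 reserves for local use and stands in here for an unnamed language with dentals only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inventory (
        language   TEXT NOT NULL CHECK (length(language) = 3),  -- ISO 639-3 code
        phoneme    TEXT NOT NULL,                               -- IPA, Unicode
        native     INTEGER NOT NULL CHECK (native IN (0, 1)),   -- 1 = native
        underlying TEXT NOT NULL,                               -- underlying form
        PRIMARY KEY (language, phoneme)
    )
""")

rows = [
    ("ind", "f", 0, "f"),        # observed in Indonesian but not native
    ("ind", "z", 0, "z"),
    ("qaa", "t\u032a", 1, "t"),  # dental with no alveolar to contrast with
    ("qaa", "d\u032a", 1, "d"),
    ("qaa", "n\u032a", 1, "n"),
]
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?, ?)", rows)

non_native = [r[0] for r in conn.execute(
    "SELECT phoneme FROM inventory "
    "WHERE language = 'ind' AND native = 0 ORDER BY phoneme")]
underlying = [r[0] for r in conn.execute(
    "SELECT underlying FROM inventory "
    "WHERE language = 'qaa' ORDER BY underlying")]
```

SQLite does not enforce a maximum length on TEXT columns, so the width of the phoneme column only becomes a decision on engines with fixed-width character types.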
Yaali Annar The Gote (posts: 94, Initiate Speaker)
Another problem I'm going to run into is determining which series are considered "phonemes" in a language. This is usually a problem in the vowel department.

(If anyone can help me with Mandarin and English vowels, that would be awesome)
Rhetorica Your Writing System Sucks (posts: 1292, Kelatetía: Dis, Major Belt 1)
How about actual frequency in a dictionary or sample text? That seems like a central concern.
KathTheDragon Beware the Dragon (posts: 92, Baroness, United Kingdom)
Something to consider with English is what dialect you're considering, especially with vowels.
dhok (posts: 235, Alkali Metal, Norman, United States)
Use upper-crust New England; it's the only correct choice:

/i ɪ ʊ u eɪ ɛ oʊ ɔ ɑ æ a ʌ ɜ/
Nessari (posts: 932, Illúbequía, Seattle, Cascadia)
/slap
Yaali Annar The Gote (posts: 94, Initiate Speaker)
quoting Rhetorica, Illúbequía: Dis, Major Belt 1:
How about actual frequency in a dictionary or sample text? That seems like a central concern.

It's a good idea, I'll add a column for that.

But for now, that column will have a default value of 0. It takes major effort to get the required corpus and then analyse it.
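
Once a transcribed sample exists, filling that column is mostly counting. This is a minimal sketch, assuming the sample has already been segmented into phonemes separated by spaces; the sample string is invented for illustration and is not from any real corpus.

```python
from collections import Counter

def phoneme_frequencies(transcription):
    """Relative frequency of each phoneme in a space-separated transcription."""
    tokens = transcription.split()
    total = len(tokens)
    if total == 0:
        return {}  # nothing to count yet, so the column keeps its default
    counts = Counter(tokens)
    return {p: n / total for p, n in counts.items()}

# Made-up sample: one phoneme per token.
sample_text = "k a t a k a m i"
freqs = phoneme_frequencies(sample_text)
```

The hard part stays where the post puts it: getting a corpus and segmenting it into phonemes consistently. The counting itself is cheap.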

quoting KathTheDragon, Baroness, United Kingdom:
Something to consider with English is what dialect you're considering, especially with vowels.

Maybe Middle English then? Shit's easy. It only has /i iː u uː e eː o oː ɛː ɔː a aː/. Just like a normal sane language.

Here's an abomination I've come up with so far:

PALM - ɑ
LOT - ɒ
CLOTH - ɒ
BATH - ɑː
TRAP - æ
PRICE - ai
START - ar
MOUTH - au
THOUGHT - o
DRESS - e
COMMA - ə
FACE - ei
SQUARE - er
LETTER - ər
KIT - i
HAPPY - i
FLEECE - iː
NEAR - ir
CHOICE - oi
NORTH - or
GOAT - ou
FORCE - our
FOOT - u
GOOSE - uː
CURE - ur
STRUT - ʌ
NURSE - ʌr
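
To see what inventory this mapping actually implies, the list can be loaded into a dictionary and inverted. This is a sketch that copies the values above as given, with length written as ː throughout; the variable names are made up for the example.

```python
# Lexical set -> transcription, copied from the list above.
LEXICAL_SETS = {
    "PALM": "ɑ", "LOT": "ɒ", "CLOTH": "ɒ", "BATH": "ɑː", "TRAP": "æ",
    "PRICE": "ai", "START": "ar", "MOUTH": "au", "THOUGHT": "o",
    "DRESS": "e", "COMMA": "ə", "FACE": "ei", "SQUARE": "er",
    "LETTER": "ər", "KIT": "i", "HAPPY": "i", "FLEECE": "iː",
    "NEAR": "ir", "CHOICE": "oi", "NORTH": "or", "GOAT": "ou",
    "FORCE": "our", "FOOT": "u", "GOOSE": "uː", "CURE": "ur",
    "STRUT": "ʌ", "NURSE": "ʌr",
}

# Invert the mapping: transcription -> lexical sets that share it.
by_vowel = {}
for lexical_set, vowel in LEXICAL_SETS.items():
    by_vowel.setdefault(vowel, []).append(lexical_set)

# Sets that collapse into a single transcription in this analysis.
merged = {v: sets for v, sets in by_vowel.items() if len(sets) > 1}
```

In this analysis only LOT/CLOTH and KIT/HAPPY share a transcription. Whether the vowel + r and vowel + glide entries count as phonemes or as sequences is the same judgement call raised earlier in the thread, and the code does not settle it.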
Nessari (posts: 932, Illúbequía, Seattle, Cascadia)
Which Middle English? The vowels in the Anglo-Saxon portions of Great Britain have been splintered ever since the Anglo-Saxons invaded. Any clean, simple sets you think you have are illusions.
Uzhdarchios (posts: 19, Foreigner in Unknown Kadath)
PHOIBLE might be of some use here, and it conveniently incorporates all the UPSID and SPA data.