happiness thread

? Jipí der saz ûf eime steine
posts: 291
, Transition Metal, Marburg, Germany
message
Congrats in this case
? kodé man of few words
posts: 110
, Deacon, this fucking hole we call LA
message
Thanks, guys! Morrigan, that "almost" is really important...
? Rhetorica Your Writing System Sucks
posts: 1279
, Kelatetía: Dis, Major Belt 1
message
Aww, jeez, you have to remind me of that thing, huh. Blargh!

But seriously: congrats!
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
A bunch of my friends just got their PhDs. ;_;

My research is coming along though. Bit by bit. And I'm working on the SCA manual and examples; writing a manual is a lot of work.
? Hallow XIII Primordial Crab
posts: 525
, 巴塞尔之侯
message
hello i'm a bachelor's student
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
I worked SO much this weekend, it was great. I made a test-standard data set for Ingush-Chechen-Batsbi, did a lot of refactoring and modularization of my alignment code, and set up the Jenetics library so I can use genetic algorithms to tune my model parameters.

The results so far are fantastic but I have a lot of work still ahead. One rather interesting result so far is that using a gap penalty (at least, a constant one) is actually bad:

[image: CGYlN-fXIAAspFa.jpg]

Admittedly, I have not yet tried this with my Indo-Iranian data. I had been using a penalty of 6, which actually was a very bad choice.
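
For readers unfamiliar with the term, here is a minimal sketch of where a constant gap penalty enters a global (Needleman-Wunsch style) alignment. It is not the alignment code described in this post: the function name `align`, the flat match/mismatch scores, and the example strings are all assumptions for illustration, and a real phonological aligner would score substitutions from a feature model instead.

```python
def align(a, b, gap=1.0, match=1.0, mismatch=-1.0):
    """Global alignment of two strings with a constant per-symbol gap penalty.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    Scoring values are illustrative assumptions, not tuned parameters.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] - gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] - gap,      # gap in b
                score[i][j - 1] - gap,      # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] - gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
            continue
        out_a.append("-")
        out_b.append(b[j - 1])
        j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Compare a small penalty with the penalty of 6 mentioned above.
    for g in (1.0, 6.0):
        print(g, align("abcd", "abd", gap=g))
```

Raising `gap` makes every insertion or deletion more expensive relative to a substitution, which is the parameter being varied in the plot.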
? Rhetorica Your Writing System Sucks
posts: 1279
, Kelatetía: Dis, Major Belt 1
message
ggplot for life!
? Pthagnar Benedictine Ovulation
posts: 209
, Quaestor Foraminis Aspirationis
message
wait what
what indo-aryan things are you doing bioinformatics to
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
Everything basically. I need to prepare some more data...

http://dialawizard.tumblr.com/post/120528229284/dialawizard-related-to-the-other-data-this-is

also this, where the Indo-Iranian proves to be tricky with the current configuration
http://dialawizard.tumblr.com/post/120495109859/ive-made-some-important-progress-in-laying-the
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
So, the latest good news is that I implemented a few new gap penalty functions and so far it looks like the Indo-Iranian data is behaving better with a non-negative gap (using a convex gap function). Still, building more sample data will be a big help.
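
As a rough illustration of what different gap penalty functions look like, here is a small sketch. The post does not say which functions were implemented, so the three below (`linear_gap`, `affine_gap`, `log_gap`) and their default costs are assumptions. In the alignment literature, "convex" usually refers to penalties like the logarithmic one, where each additional gap position costs no more than the one before it.

```python
import math


def linear_gap(k, per_symbol=1.0):
    """Constant cost per gap position: a gap of length k costs per_symbol * k."""
    return per_symbol * k


def affine_gap(k, open_cost=2.0, extend_cost=0.5):
    """Opening a gap costs open_cost; each further position costs extend_cost."""
    return 0.0 if k == 0 else open_cost + extend_cost * (k - 1)


def log_gap(k, open_cost=2.0, scale=1.0):
    """Logarithmic penalty: the marginal cost shrinks as the gap gets longer."""
    return 0.0 if k == 0 else open_cost + scale * math.log(k)


if __name__ == "__main__":
    # Print the cost of gaps of length 1..5 under each function.
    for k in range(1, 6):
        print(k, linear_gap(k), affine_gap(k), round(log_gap(k), 3))
```

Note that plugging a length-dependent penalty into an aligner requires tracking whole gaps rather than single positions, so the simple three-way recurrence used for a constant penalty no longer suffices.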
? Rhetorica Your Writing System Sucks
posts: 1279
, Kelatetía: Dis, Major Belt 1
message
...I think I know what she's doing, and I'm not completely certain I like it.

Do you realize you'll need to build a whole new model for every language family?
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
There are concerns about that. But mostly I'm interested in using simulated data to examine how things like inherited vocabulary size and overlap between children affect information recoverability.

Though I have suspicions that when it comes to actual reconstruction, unsupervised machine learning might be viable.
? Rhetorica Your Writing System Sucks
posts: 1279
, Kelatetía: Dis, Major Belt 1
message
Yyyeah, you should've started with that mentality, I think. Sequence alignment is my speciality, and the functions optimized by the standard approaches are grossly incorrect from a biological standpoint; they just have momentum because they're verifiable and objective. The thought of trying to bring that into a linguistics setting makes me squeamish.
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
Oh interesting.

Sequence alignment isn't the interesting problem here though; for the most part that's probably going to be inferring proto-forms and rules from correspondences. Which I'm able to get fairly handily. Without training on my Chechen-Ingush-Batsbi data, the system was able to pick out correspondences which I know to be correct. What will be interesting is seeing if it can use a (probably statistical) model to infer reasonable ancestor forms, and identify conditioning environments.
? Jipí der saz ûf eime steine
posts: 291
, Transition Metal, Marburg, Germany
message
So … you may have explained it before, but what are those charts depicting? I understand that historical linguistics has become increasingly influenced and informed by genetics, epidemiology and population biology over the past decades, but that still doesn't give me a clue what those charts you posted mean.
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
They depict the way that model parameters impact the performance of the algorithm vs some human-aligned data. Basically all of that involves sequence alignment stuff, and weight coefficients for my feature model.

I was thinking I could make a thread about this over in Terra Firma
? Hallow XIII Primordial Crab
posts: 525
, 巴塞尔之侯
message
Relatedly how come neither the Linguistics nor (!) the Computational Linguistics BA program here felt the need to include statistics

in fact how come there are any university degrees at all that allow people to get away with not taking math classes

and most importantly why can't I get points for doing it anyway like a reasonable person
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
yeah, I don't get that. we DID have a quantitative methods course, but frankly it was terrible, the professor sucked (he failed to get tenure, ultimately) and the book sucked. Keith Johnson, I think, the orange one that does everything in R and explains nothing.
? Hallow XIII Primordial Crab
posts: 525
, 巴塞尔之侯
message
Yeah, same here. I mean, okay, it's vaguely more forgivable in traditional linguistics, but if I study CL and I don't have to take statistics what does that say about the value of my degree? (The regulations that prevent CS from being available in 90-point format are also terrible.)
? Rhetorica Your Writing System Sucks
posts: 1279
, Kelatetía: Dis, Major Belt 1
message
That's obscene. You're both fired.