happiness thread

? Jipí der saz ûf eime steine
posts: 291
, Transition Metal, Marburg, Germany
message
FIRST AS A TRAGEDY
THEN AS A FARCE

THEN AS SEQUENCE ALIGNMENT

vWA8lru.png
? Hallow XIII Primordial Crab
posts: 539
, 巴塞尔之侯
message
karl marx, vladimir ulyanov and lev levenshtein
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
i'm afraid.


I'm just using sequence alignment as a first pass to get likely correspondences from cognate pairs (or more, but alignment in n dimensions might not yield any better results than doing all pairs and tree-building). So it seems pretty uncomplicated.
? Rhetorica Your Writing System Sucks
posts: 1292
, Kelatetía: Dis, Major Belt 1
message
That is definitely what I was afraid of. Unless you added scores for linguistically-relevant transformations like metathesis and rank misalignments according to how many steps of sound change are required to go from one to the other (and weight them according to frequency.) Did you do any of that, or is it basically just Levenshtein edit distance?
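
For reference, a minimal sketch of what "basically just Levenshtein" looks like once a metathesis operation is bolted on. This is the optimal-string-alignment variant, where an adjacent swap counts as one edit. The function name and all three cost parameters are illustrative assumptions, not anything taken from the tool under discussion.

```python
def edit_distance(a, b, sub_cost=1.0, indel_cost=1.0, swap_cost=1.0):
    """Levenshtein distance plus adjacent swaps (optimal string alignment)."""
    n, m = len(a), len(b)
    # d[i][j] = cheapest way to turn a[:i] into b[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub_cost
            d[i][j] = min(
                d[i - 1][j] + indel_cost,      # deletion
                d[i][j - 1] + indel_cost,      # insertion
                d[i - 1][j - 1] + cost,        # match or substitution
            )
            # Metathesis: the last two symbols appear in swapped order.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + swap_cost)
    return d[n][m]
```

With the default costs, "aks" versus "ask" costs one swap rather than two substitutions. Weighting the costs by how often each change is attested would be the next step the post asks about.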

"Quick-and-dirty" approaches are a major problem in bioinformatics, mostly outside of Europe. You end up with tools and pipelines that work for only a small portion of the data in practice, and it's usually not the interesting parts. We live in the shadow of a seemingly-immortal dictator called BLAST, which tries to speed up alignments by finding n-mers of several letters common to two sequences and then extending them as long as they match. It's meant to be a first-pass lookup tool for people investigating and working with a single gene, but in practice people will gladly run millions or even billions of short fragments (which it's not supposed to be used for) with diverse evolutionary heritage (another problem: alignment scoring schemes are usually distance-dependent) through the wretched thing.
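
The seed-and-extend idea described above can be sketched in a few lines. This is a deliberately simplified toy that only extends over exact matches, as the description says. Real BLAST scores the extensions and applies thresholds, none of which is modelled here. The function name and the default `k` are assumptions made for illustration.

```python
def seed_and_extend(query, target, k=3):
    """Find shared k-mers, then grow each hit while the letters keep matching."""
    # Index every k-mer of the target by its start position.
    index = {}
    for i in range(len(target) - k + 1):
        index.setdefault(target[i:i + k], []).append(i)

    hits = set()
    for q in range(len(query) - k + 1):
        for t in index.get(query[q:q + k], []):
            # Extend to the left while the preceding letters agree.
            qs, ts = q, t
            while qs > 0 and ts > 0 and query[qs - 1] == target[ts - 1]:
                qs -= 1
                ts -= 1
            # Extend to the right while the following letters agree.
            qe, te = q + k, t + k
            while qe < len(query) and te < len(target) and query[qe] == target[te]:
                qe += 1
                te += 1
            hits.add((qs, ts, qe - qs))  # (query start, target start, length)
    return sorted(hits)
```

Overlapping seeds that grow into the same region collapse into one hit because the results are stored in a set. Fragments shorter than `k`, or with no exact k-mer in common, are never found, which is one reason the approach suits single-gene lookups better than short diverse reads.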

The result of this: it's accepted wisdom that there are far more different kinds of bacteria in any given environment than there actually are. (Although there's another story about falling in love with flawed marker genes that I'll tell you some other time, the metagenomic analyses ostensibly corroborate them.)

If it helps any, think of each alignment as the log transform of a giant joint probability. Then you'll be unable to go back to just thinking of it as an abstract 'score.'
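
One common way to make that concrete is the log-odds reading of substitution scores, sketched here for an ungapped alignment of sequences a and b. The symbols are notation introduced for this note: p(x, y) is the probability of seeing x aligned with y when the sequences are related, and q(x) is the background frequency of x.

```latex
S(a, b) \;=\; \sum_{i} s(a_i, b_i)
        \;=\; \sum_{i} \log \frac{p(a_i, b_i)}{q(a_i)\, q(b_i)}
        \;=\; \log \frac{\prod_{i} p(a_i, b_i)}{\prod_{i} q(a_i)\, q(b_i)}
```

Under the assumption that positions are independent, the total score is the log of a ratio of two joint probabilities: the whole alignment under a "related" model against the same letters under a "chance" model.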
? Jipí der saz ûf eime steine
posts: 291
, Transition Metal, Marburg, Germany
message
Tate-blast.jpg

(This one was short-lived, though)
? Rhetorica Your Writing System Sucks
posts: 1292
, Kelatetía: Dis, Major Belt 1
message
This, right? I actually had a boxed copy of it (okay—just the manuals and box) that I found at a junk shop. No idea where it is now.
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
Definitely NOT using Levenshtein. Each segment in a sequence is represented by a multivalue feature vector. Right now, this is just a multidimensional vector distance weighted by the strength of each feature (they use different scales), but I'd prefer that it represented a probability (or -log thereof) that a pair was related, which is obviously more complicated to model.
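
A minimal sketch of a weighted feature-vector distance of this kind, assuming a weighted Euclidean form. The feature names, the weights and the example values are all hypothetical placeholders, not the actual inventory.

```python
import math

# Hypothetical features and weights, chosen only for illustration.
WEIGHTS = {"voice": 1.0, "place": 2.0, "manner": 3.0}

def segment_distance(x, y, weights=WEIGHTS):
    """Weighted Euclidean distance between two segments' feature vectors."""
    return math.sqrt(sum(w * (x[f] - y[f]) ** 2 for f, w in weights.items()))

# Two toy segments that differ only in voicing.
p = {"voice": 0.0, "place": 0.0, "manner": 0.0}
b = {"voice": 1.0, "place": 0.0, "manner": 0.0}
```

Here `segment_distance(p, b)` depends only on the voicing weight. Swapping this for a -log probability would mean replacing the hand-set weights with values estimated from attested correspondences.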

I don't have metathesis explicitly implemented yet, but the algorithm already has a way of comparing short subsequences (length 1 to n, though anything above n=3 is absurd and even that is questionable), so a 2-2 comparison would compare segments where one underwent metathesis. There are cases where this is probably not sufficient though.
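
One way such block comparisons can be folded into the alignment itself is a dynamic program in which each step consumes up to n symbols from either side. The sketch below is an assumption about the general shape, not the actual algorithm. `block_cost` is a toy stand-in with placeholder values, and plain strings stand in for sequences of feature vectors.

```python
def block_cost(x, y):
    """Toy cost for aligning block x with block y (placeholder values)."""
    if x == y:
        return 0.0
    if len(x) == len(y) == 2 and x == y[::-1]:
        return 0.5                      # reversed pair: treat as metathesis
    if len(x) == len(y):
        return float(sum(p != q for p, q in zip(x, y)))
    return float(max(len(x), len(y)))   # gaps, mergers, splits

def align_cost(a, b, n=2):
    """Cheapest alignment where each step pairs blocks of length 0..n."""
    inf = float("inf")
    d = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0.0
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            for p in range(min(n, i) + 1):
                for q in range(min(n, j) + 1):
                    if p == 0 and q == 0:
                        continue        # a step must consume something
                    cand = d[i - p][j - q] + block_cost(a[i - p:i], b[j - q:j])
                    if cand < d[i][j]:
                        d[i][j] = cand
    return d[len(a)][len(b)]
```

With n=2 the pair "aks" and "ask" aligns through one 2-2 block at the metathesis cost. With n=1 the same pair has to pay for two separate mismatches. Metathesis across a longer distance still falls outside any fixed block size, which may be one of the insufficient cases mentioned above.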

The ranking is an interesting problem, but conceivably that's the interesting problem. Given a set of correspondences and environments, I'll need to figure out a way to work backward and possibly re-run the alignments using new information based on discovering non-viable reconstructions, or discovering that the alignments we derived are somehow not viable.

The most important question is how this system performs when given garbage, viz. a set of chance resemblances between unrelated languages. I need to build an algorithm that can tell these apart, or at least determine that the relationship is a chance one.
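
One generic way to probe the chance-resemblance question is a permutation test: shuffle which word is paired with which, and see how often random pairings look at least as similar as the real ones. This is offered only as a sketch of that general technique, not as the planned algorithm. The function names, the toy distance and the word list are all made up for illustration.

```python
import random

def permutation_p_value(pairs, distance, trials=1000, seed=0):
    """Share of shuffled pairings scoring at least as well as the real one."""
    rng = random.Random(seed)           # fixed seed so runs are repeatable
    left = [a for a, _ in pairs]
    right = [b for _, b in pairs]
    observed = sum(distance(a, b) for a, b in zip(left, right))
    hits = 0
    for _ in range(trials):
        shuffled = right[:]
        rng.shuffle(shuffled)
        total = sum(distance(a, b) for a, b in zip(left, shuffled))
        if total <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (trials + 1)

def mismatch(a, b):
    """Toy distance: 0 for identical words, 1 otherwise."""
    return 0.0 if a == b else 1.0

# Invented word list in which every pair is identical.
toy_pairs = [(w, w) for w in
             ["pater", "mater", "tres", "nokt", "oktu", "dekm", "newo", "sal"]]
```

A small value from `permutation_p_value(toy_pairs, mismatch)` says the real pairing beats nearly every shuffle. A value near 1 would say the resemblances are no better than chance. In practice the distance would be the alignment score rather than this toy.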

I'll start a thread some time tonight if I'm able to get the time. I'm supposed to have dinner with my cousin, so who knows.
? Rhetorica Your Writing System Sucks
posts: 1292
, Kelatetía: Dis, Major Belt 1
message
what is it with SCA developers who have trouble with metathesis gah

I'm really glad to hear you're using a sophisticated featural vector approach. My next big bio project is going to be something very analogous; I started work on it a couple of years ago but scrapped it because I realised I was implementing the actual alignment procedure in an asinine way.

A neat trick I discovered which might help handling metathesis: convert your feature vectors to floating point (if they aren't already) and add a small fraction of the neighbouring phonemes to each value. This will 'colour' your letters according to their environment. I came up with this while looking into ways to compress multiple nucleotides into one slot for the first pass on a local search, relying more on the histogram than the basic sequence. (Although, obviously, that approach requires some caution to deal with frameshift.)
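
A minimal sketch of the colouring trick as described, assuming each phoneme is a list of numeric feature values and that only the immediate left and right neighbours contribute. The function name and the default `fraction` are assumptions.

```python
def colour(vectors, fraction=0.1):
    """Add a fraction of each neighbour's features to every position."""
    out = []
    for i, v in enumerate(vectors):
        mixed = [float(x) for x in v]           # convert to floating point
        for j in (i - 1, i + 1):                # left and right neighbours
            if 0 <= j < len(vectors):
                for k, x in enumerate(vectors[j]):
                    mixed[k] += fraction * x
        out.append(mixed)
    return out
```

For example, `colour([[1, 0], [0, 1]], 0.25)` gives `[[1.0, 0.25], [0.25, 1.0]]`. Each position now carries a trace of its neighbour, so a swapped pair compared position by position is a partial match rather than a total mismatch.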

I look forward to your thread.
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
Yeah! That's exactly what I'm doing actually, I just need to work on feature weights (or perhaps, probability of feature-transfer, something like that).

My SCA handles metathesis quite nicely, I'll have you know! I need to finish the manual for that too, I've been slogging through it for a month.
? Rhetorica Your Writing System Sucks
posts: 1292
, Kelatetía: Dis, Major Belt 1
message
More things I look forward to!
? bloodbath, Ph.D. Physicist and Numismatist
posts: 75
, Hydrogen, Ohio, USA
message
So, last fall, I applied for a grant from the FNR (Luxembourg Research Foundation). I got rejected. I was sad.

I reworked and tweaked the application and decided to try again for this spring, especially since I felt like last time was a practice run. My application was "retained for funding".

This means I'm moving to Luxembourg in the foreseeable future, but the exact timeline is dependent on how the process of getting my residence permit goes. But this means I'll be finishing up my PhD work over there, which makes me rather happy.
? hwhatting posts: 105
, Sophomore message
Congratulations!
? Morrígan Witch Queen of New York
posts: 303
, Marquise message
That's awesome! Congratulations~
? twabs fair maiden
posts: 228
, Conversational Speaker, /ˈajwʌ/
message
I have made indian bread!

2015-10-10%2019.14.46.jpg

Then I ate it all, and now there is naan left :-(
? hwhatting posts: 105
, Sophomore, Bonn, Germany
message
Pun 'em et circenses...
? bloodbath, Ph.D. Physicist and Numismatist
posts: 75
, Hydrogen, Ohio, USA
message
I am now in Luxembourg. There's been a lot of running around since moving here a week ago with administrative stuff and some practical things, but things are coming along nicely. I think.

Big things are that I need to buy some chemicals soon so I can start researching, order a computer for my office, and finalize my titre de séjour, the latter of which is going to be a big bother (yay for medical exams). But hopefully all will be great and I can then soon book my trip to India for my friend's wedding.
? Izambri Left of the middle
posts: 969
, Duke, the Findible League
message
Kentaro Miura's Berserk is being translated into Catalan. I can't be happier.

    Published as 'Maximum' edition; three volumes issued right now, bimensual.

          Happiness.
? Hallow XIII Primordial Crab
posts: 539
, 蘇黎世之侯
message
May he rest in peace!
? bloodbath, Ph.D. Physicist and Numismatist
posts: 75
, Hydrogen, The Nether Regions
message
I realize one of my last posts in this thread was getting a grant to go to Luxembourg. Well... it turns out I got a grant to go back to Luxembourg.

Summary: There's a highly competitive grant program here in Europe called Marie Skłodowska-Curie, which is a two-year grant for a postdoctoral project. I applied four years ago for a grant to stay in the Netherlands and was spectacularly unsuccessful. I submitted the application for this round back in September, writing my own project for wearable technology development, but this time with a lot of help from the prospective host institute. Well, the results were announced on Monday. To my immense surprise, I ended up with a very high score (99.4/100), which means that I'll be moving back this summer (immigration procedures willing).

Pretty red passport, here I come.
? hwhatting posts: 105
, Sophomore, Bonn, Germany
message
Congratulations!