What is it with SCA developers who have trouble with metathesis? Gah.
I'm really glad to hear you're using a sophisticated featural vector approach. My next big bio project is going to be something very analogous; I started work on it a couple of years ago but scrapped it because I realised I was implementing the actual alignment procedure in an asinine way.
A neat trick I discovered which might help with handling metathesis: convert your feature vectors to floating point (if they aren't already) and add a small fraction of the neighbouring phonemes' values to each value. This will 'colour' your letters according to their environment. I came up with this while looking into ways to compress multiple nucleotides into one slot for the first pass on a local search, relying more on the histogram than the basic sequence. (Although, obviously, that approach requires some caution to deal with frameshifts.)
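To make the colouring idea concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the three-feature one-hot vectors, the names `colour`, `sq_distance` and `encode`, and the 0.25 weight are all assumptions, not anything from your engine. It just shows that once each phoneme has absorbed a bit of its neighbours, a metathesised pair like "ask" / "aks" scores as closer, position by position, than it does with the raw vectors.

```python
# Toy feature table; real featural vectors would be much richer (hypothetical values).
FEATURES = {
    "a": [1, 0, 0],
    "s": [0, 1, 0],
    "k": [0, 0, 1],
}


def encode(word):
    """Turn a string into a list of feature vectors."""
    return [FEATURES[ch] for ch in word]


def colour(vectors, weight=0.25):
    """Add `weight` times each immediate neighbour's vector to every vector."""
    coloured = []
    for i, vec in enumerate(vectors):
        out = [float(x) for x in vec]
        for j in (i - 1, i + 1):
            if 0 <= j < len(vectors):  # edges only have one neighbour
                for k, x in enumerate(vectors[j]):
                    out[k] += weight * x
        coloured.append(out)
    return coloured


def sq_distance(seq_a, seq_b):
    """Sum of squared differences, compared position by position."""
    return sum(
        (x - y) ** 2
        for va, vb in zip(seq_a, seq_b)
        for x, y in zip(va, vb)
    )


plain = sq_distance(encode("ask"), encode("aks"))
tinted = sq_distance(colour(encode("ask")), colour(encode("aks")))
print(plain, tinted)  # the coloured distance comes out smaller
```

The weight is the knob to play with: too small and the swap still looks like two unrelated substitutions, too large and every phoneme blurs into its surroundings.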
I look forward to your thread.