Sound Change Appliers


Morrígan Witch Queen of New York
posts: 303
, Marquise message

Yeah, so never mind, that won't work; I didn't think hard enough about how things would line up. However,

Cn Cm > nC mC

will work, as long as you enumerate one of the sets.

9 years ago link

Rhetorica Your Writing System Sucks
posts: 1292
, Kelatetía, Koitra, Illera
message

...and that is why I wanted to make klank support metathesis as a core feature. It's really, really obnoxious to have to enumerate things for larger and likely-to-change sets.

9 years ago link

Morrígan Witch Queen of New York
posts: 303
, Marquise message

I don't think it should be hard to add that functionality, but I'm not sure what is a good way to do it. Unless I do something like in regex, maybe like

(C)(N) > $2$1

Or specifying a particular metathesis rule, but the above has more general applicability.

edited once, last update 9 years ago link

Rhetorica Your Writing System Sucks
posts: 1292
, Kelatetía, Koitra, Illera
message

Take it a step further: permit class permutation at the same time.

(C)(N) > $C2$N1

Where the index in the set N is used to look up the index in C. So if C = {p t k} and N = {m n ŋ} this would swap the place of articulation without disturbing the mode.
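A rough Python sketch of that lookup, assuming single-character segments and two classes of equal size. The helper name `permute_pair` is made up for illustration and is not part of any existing SCA.

```python
def permute_pair(word, c_set, n_set):
    """Sketch of (C)(N) > $C2$N1: each output segment is drawn from the
    class named in the target, at the index of the numbered match."""
    if len(c_set) != len(n_set):
        raise ValueError("both classes need the same number of entries")
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair[0] in c_set and pair[1] in n_set:
            i1 = c_set.index(pair[0])  # index of match 1 within C
            i2 = n_set.index(pair[1])  # index of match 2 within N
            out.append(c_set[i2])      # $C2: member of C at match 2's index
            out.append(n_set[i1])      # $N1: member of N at match 1's index
            i += 2
        else:
            out.append(word[i])
            i += 1
    return "".join(out)

C = ["p", "t", "k"]
N = ["m", "n", "ŋ"]
print(permute_pair("apna", C, N))  # pn becomes tm
```

The stop stays a stop and the nasal stays a nasal; only the positions within each class are exchanged.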

If you add this, then I can retire happily and not write klank at all, confident that you've got all the bases covered, and it'll just be a matter of writing a wrapper module for the HT SCA and building a front-end that integrates into Anthologica. (which will be waaaay easier than writing a whole darn new SCA.)

I think the only other feature we were considering (that I doubt anyone's tried before) was attempting automatic syllable break detection.

9 years ago link

Morrígan Witch Queen of New York
posts: 303
, Marquise message

That's an interesting idea, but it also means that both sets have to have the same number of entries. That might be a reasonable extension of the approach under discussion. I've already created a ticket for this on GitHub.

Syllable-break detection is an interesting problem. I think providing a syllable template would give us a good place to start. The biggest problem is that none of my tools actually handle suprasegmental features correctly, or know what a syllable is.

9 years ago link

Rhetorica Your Writing System Sucks
posts: 1292
, Kelatetía, Koitra, Illera
message

Sounds good! And, actually, I have one more request, maybe; the ability to specify multiple rules files and have the outputs be saved in an interleaved format. E.g.

word 1 ruleset 1
word 1 ruleset 2
word 1 ruleset 3
word 2 ruleset 1
word 2 ruleset 2
word 2 ruleset 3

... etc.

This would be greatly preferable to running the SCA a dozen times for a basic inflection profile.

A lot of times, the program may only get run with one or two words, so cutting down on the number of individual runs for when many rulesets need to be applied (e.g. inflecting in a parent language, then transmitting to several daughter languages, then inflecting them there, too) would be highly desirable.
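The interleaving itself could look something like the Python sketch below. The rulesets here are hypothetical stand-in functions rather than real rule files, and the tab-separated output is only one possible layout.

```python
def interleave(words, rulesets):
    """Yield (word, ruleset name, result), keeping all results for one
    word together before moving on to the next word."""
    for word in words:
        for name, apply in rulesets.items():
            yield word, name, apply(word)

# Hypothetical stand-ins for compiled rule files
rulesets = {
    "ruleset 1": lambda w: w.replace("p", "b"),
    "ruleset 2": lambda w: w.replace("a", "e"),
}

for word, name, result in interleave(["apa", "tapa"], rulesets):
    print(f"{word}\t{name}\t{result}")
```

A single run then covers every word against every ruleset, which is the point of the request.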

edited 2 times, last update 9 years ago link

Nessari
posts: 932
, Illúbequía, Seattle, Cascadia
message

Programmers intensely interested in SCA creation and use working together


edited once, last update 9 years ago link

Rhetorica Your Writing System Sucks
posts: 1292
, Kelatetía, Koitra, Illera
message

Well, um, not to let things get too out of hand, but, uh...

Maybe?

edited once, last update 9 years ago link

Morrígan Witch Queen of New York
posts: 303
, Marquise message

I have the processing model set up to use file-handles already.

Right now you load one lexicon and it gets run through DEFAULT. But the lexicons are actually stored in a map, even though there is only one of them right now. I'm interested in letting users do a couple of things, like:

EXECUTE other rule files, which just does what that rule file does in a separate process;

IMPORT other rule files, which basically inserts those rules into your current rule file;

OPEN "some_lexicon.txt" (as) FILEHANDLE to load the contents of that file into a lexicon stored against the file-handle;

WRITE FILEHANDLE (as) "some_output1.txt" to save the current state of the lexicon to the specified file, but leave the handle open;

CLOSE FILEHANDLE (as) "some_output2.txt" to close the file-handle and save the lexicon to the specified file.
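To make the OPEN / WRITE / CLOSE semantics concrete, here is a small Python sketch of lexicons stored in a map keyed by handle name. The class and method names are hypothetical and say nothing about how the actual tool is written; one word per line is also an assumption.

```python
import os
import tempfile

class LexiconStore:
    """Hypothetical store: lexicons live in a map keyed by file-handle name."""

    def __init__(self):
        self.lexicons = {}

    def open_lexicon(self, path, handle):
        # OPEN "path" (as) HANDLE: load one word per line into the map
        with open(path, encoding="utf-8") as f:
            self.lexicons[handle] = [line.strip() for line in f if line.strip()]

    def write_lexicon(self, handle, path):
        # WRITE HANDLE (as) "path": save the current state, keep the handle
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(self.lexicons[handle]) + "\n")

    def close_lexicon(self, handle, path):
        # CLOSE HANDLE (as) "path": save, then drop the handle
        self.write_lexicon(handle, path)
        del self.lexicons[handle]

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "some_lexicon.txt")
        with open(source, "w", encoding="utf-8") as f:
            f.write("apa\ntapa\n")
        store = LexiconStore()
        store.open_lexicon(source, "LEX")
        # Stand-in for running rules against the lexicon
        store.lexicons["LEX"] = [w.replace("p", "b") for w in store.lexicons["LEX"]]
        output = os.path.join(tmp, "some_output2.txt")
        store.close_lexicon("LEX", output)
        with open(output, encoding="utf-8") as f:
            return f.read().split(), sorted(store.lexicons)

print(demo())
```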

edited 3 times, last update 9 years ago link

Nessari
posts: 932
, Illúbequía, Seattle, Cascadia
message

quoting Rhetorica:
Well, um, not to let things get too out of hand, but, uh...

Maybe?

edited once, last update 9 years ago link

Morrígan Witch Queen of New York
posts: 303
, Marquise message

I've been quite busy and have had little time to work on the feature system. However, I did make a quick implementation of something that should be fairly useful and was easy to do.

I've added support for OR and NOT in rule conditions. This way, when the same transformation would otherwise need a separate rule for each condition, it can be expressed as a single rule rather than two. The following rules apply just as you would expect:

a > b / x_ OR _y

a is changed to b wherever it follows x or wherever it precedes y.

C = x y z
a > b / C_ NOT y_

Similarly, a is changed to b wherever it follows x or z, but not where it follows y.
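One way to picture how OR and NOT combine is the Python sketch below. It assumes single-character segments and represents each condition as a hypothetical (before, after) pair of segment sets; it is an illustration of the logic, not the tool's actual code.

```python
def matches(word, i, cond):
    """cond is (before, after); each is a set of segments, or None for 'anything'."""
    before, after = cond
    if before is not None and (i == 0 or word[i - 1] not in before):
        return False
    if after is not None and (i + 1 >= len(word) or word[i + 1] not in after):
        return False
    return True

def apply_rule(word, source, target, any_of, none_of=()):
    """Change source to target where at least one OR-condition holds
    and no NOT-condition holds. Conditions are checked on the input word."""
    out = []
    for i, seg in enumerate(word):
        ok = (seg == source
              and any(matches(word, i, c) for c in any_of)
              and not any(matches(word, i, c) for c in none_of))
        out.append(target if ok else seg)
    return "".join(out)

# a > b / x_ OR _y
print(apply_rule("xataya", "a", "b", [({"x"}, None), (None, {"y"})]))

# C = x y z
# a > b / C_ NOT y_
C = {"x", "y", "z"}
print(apply_rule("xayaza", "a", "b", [(C, None)], [({"y"}, None)]))
```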

The updated manual for this is available here:
https://github.com/sfmorrigan/toolbox-sca/tree/Compound-conditions

The jar for this release is available here:
https://drive.google.com/file/d/0B0yPt9F6jiAbMUozSmNqQlNtckU/edit?usp=sharing

These changes are not yet merged into the main branch and will remain separate until I've been able to more adequately test the functionality. The changes are fairly simple and I do not expect any complications.

edited once, last update 9 years ago link

Hâlian the Protogen
posts: 142
, Alípteza, Florida
message

Who's awesome?

You're awesome.

9 years ago link

Nortaneous
posts: 467
, Marquis, Maryland
message

will test this with the V'eng -> Vian sound changes. IS THIS SCA A BAD ENOUGH DUDE TO TONOGENESIS

9 years ago link

Nortaneous
posts: 467
, Marquis, Maryland
message

Is there a way to get the SCA to spit out all the intermediate changes? (Words should not have two tones and I am not sure where I fucked up.)

9 years ago link

Morrígan Witch Queen of New York
posts: 303
, Marquise message

quoting Nortaneous:
Is there a way to get the SCA to spit out all the intermediate changes? (Words should not have two tones and I am not sure where I fucked up.)

Not really, though I've thought about that. I can include a BREAK command to terminate parsing in the middle of a file, so you can easily stop it without having to comment out half the file. Also, since I'm using slf4j, I could put in a lot of debug statements and just have it print a log.

Hopefully though, she is a bad enough dude to tonogenesis.

UPDATED: I added BREAK - if you use it, the SCA can't see anything after the BREAK command.
https://drive.google.com/file/d/0B0yPt9F6jiAbaUxReW9WUjN2Y3M/edit?usp=sharing
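For anyone wondering how BREAK helps with debugging, a toy Python sketch follows. Both functions are hypothetical: `load_rules` drops everything from BREAK onward, and `trace` handles only unconditioned "x > y" rules in order to show the kind of step-by-step log that was asked about.

```python
def load_rules(lines):
    """Collect rule lines, ignoring everything from a BREAK line onward."""
    rules = []
    for line in lines:
        line = line.strip()
        if line == "BREAK":
            break
        if line:
            rules.append(line)
    return rules

def trace(word, rules):
    """Apply unconditioned 'x > y' rules in order, recording each intermediate form."""
    steps = [word]
    for rule in rules:
        source, target = (part.strip() for part in rule.split(">"))
        word = word.replace(source, target)
        steps.append(word)
    return steps

script = ["p > b", "a > e", "BREAK", "e > i"]
rules = load_rules(script)
print(rules)
print(trace("apa", rules))
```

Moving the BREAK line up or down the file is then a cheap way to find the rule where a word first goes wrong.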

edited once, last update 9 years ago link

Morrígan Witch Queen of New York
posts: 303
, Marquise,
message

Not really an update (theorypost?), but I'm working on the metathesis functionality. Still gotta do some work, but I have an approach that should be fairly efficient and also powerful. Conceptually, it's straightforward, so I can probably get it done this weekend. It does involve some substantial refactoring of the rule-application code, but I already have a lot of unit tests that should cover the relevant test cases.

So, basically the way we are gonna do it is as follows: when you write a rule, each variable reference will be automatically indexed, so in CN > $2$1, the SCA will notice C first and index it as 1. Then the SCA basically checks if the source pattern CN matches anywhere in the word. In the process of determining if it is a match, it will keep track of which element of C it starts with, which is stored and used to generate the new string.

Not 100% sure what the notation will be for using the same index on a different variable, but perhaps CN > N2G1 or CN > $N2$G1. When parsing the rule, the SCA can ensure that C and G have the same number of elements.

I haven't attempted to do this yet, but it might also be possible to have the SCA infer the mapping, so that if you write CN > NC it can tell that 1) you've used the same symbols and 2) if C and N have different sizes, that you've used different variables which have the same number of elements. This could be overkill though, and seems likely to produce unintended consequences.

edited once, last update 9 years ago link

Morrígan Witch Queen of New York
posts: 303
, Marquise message

More Theoryposting

I've made some progress on implementing metathesis, but the process involves rewriting a lot of the code that does rule application, and there are some complications (the same ones I've had before, mostly - setting the right conditions under which to update the cursor).

But it occurred to me that I might be duplicating work: if I implement what is described in the above post, then I can treat the source pattern (the first part of the transform statement, on the left side of the > ) as a state machine. This would also allow us to make use of some advanced functionality here, though obviously more power can be dangerous. Ensuring that backreferences behave correctly and such could be a challenge. I might just define a subset of the expression language that excludes *, +, and ? metacharacters.
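As a sketch of what that could mean in practice, the Python below translates a source pattern into a regular expression with one capture group per variable, and rejects the *, +, and ? metacharacters. It uses Python's standard `re` module purely for illustration; the function names are hypothetical and the real tool may do this quite differently.

```python
import re

def compile_source(pattern, classes):
    """Turn e.g. 'CN' into '(p|t|k)(m|n)': one capture group per variable,
    so group numbers line up with the $ indices."""
    if any(ch in "*+?" for ch in pattern):
        raise ValueError("quantifiers are outside this subset")
    parts = []
    for ch in pattern:
        if ch in classes:
            parts.append("(" + "|".join(re.escape(m) for m in classes[ch]) + ")")
        else:
            parts.append(re.escape(ch))  # literal segment, no group
    return re.compile("".join(parts))

def compile_target(target):
    """Rewrite each $n as the numbered-group reference that re.sub expects."""
    return re.sub(r"\$(\d+)", r"\\g<\1>", target)

classes = {"C": ["p", "t", "k"], "N": ["m", "n"]}
source = compile_source("CN", classes)
print(source.sub(compile_target("$2$1"), "apna"))
```

Leaving quantifiers out keeps every group matching exactly one segment, which is what makes the backreferences easy to reason about.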

Thoughts, considerations, rude remarks?

edited once, last update 9 years ago link

Nessari
posts: 932
, Illúbequía, Seattle, Cascadia
message

9 years ago link

Morrígan Witch Queen of New York
posts: 303
, Marquise message

Basically.

9 years ago link

Morrígan Witch Queen of New York
posts: 303
, Marquise message

Metathesis - Beta

https://drive.google.com/file/d/0B0yPt9F6jiAbc3Q3M2tEclZ4MWs/edit?usp=sharing

I changed the way that rules are parsed and applied, and got all of the unit tests working. Right now, you should be able to do metathesis like

CN > $2$1

in order to reverse the order of two segments. You can also do longer-distance things like

CVN > $3V$1

which will swap C and N but leave V; the statement CVN > $3$2$1 is equivalent in this case. Just remember that the index number after the $ sign counts every variable, and starts at 1.
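A small Python model of that indexing, for illustration only. It assumes single-character segments and a source made up entirely of variables, and it guesses that a bare variable in the target reuses the match of the first same-named variable in the source. `apply_metathesis` is a hypothetical name, not the tool's API.

```python
import re

def apply_metathesis(word, source, target, classes):
    """Every variable in `source` gets a 1-based index; `$n` in `target`
    emits the segment matched by variable n. A bare variable in `target`
    reuses the match of the first same-named source variable (an assumption)."""
    tokens = re.findall(r"\$\d+|.", target)
    out, i, n = [], 0, len(source)
    while i < len(word):
        window = word[i:i + n]
        if len(window) == n and all(seg in classes[var] for seg, var in zip(window, source)):
            for tok in tokens:
                if len(tok) > 1 and tok[0] == "$":
                    out.append(window[int(tok[1:]) - 1])     # indexed reference
                elif tok in classes:
                    out.append(window[source.index(tok)])    # bare variable
                else:
                    out.append(tok)                          # literal segment
            i += n
        else:
            out.append(word[i])
            i += 1
    return "".join(out)

classes = {"C": set("ptk"), "V": set("aeiou"), "N": set("mn")}
print(apply_metathesis("apna", "CN", "$2$1", classes))
print(apply_metathesis("pan", "CVN", "$3V$1", classes))
print(apply_metathesis("pan", "CVN", "$3$2$1", classes))
```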

It should be fairly easy to add further functionality so that

CN > $2$G1

will be possible, so that you can (in this case) do metathesis and change voice at the same time, provided of course that G and C have the same number of elements.

edited once, last update 9 years ago link
