corpus-joyce-ulysses-tei Loanwords / compound words / Joyceanisms

‘Proteus’ is an episode of edge cases. Today’s dilemma: loanwords.

As the cockle pickers pass Stephen on their way from the shore-line, he thinks to himself:

She trudges, schlepps, trains, drags, trascines her load. (U 3.392–93; my emphasis)

How would we encode this multilingual description in which translations for the verb ‘to drag’ from Yiddish / German (shlepn / schleppen), French (traîner), and Italian (trascinare) have been ‘Englished’ or anglicized† in Stephen’s interior monologue?

† OED has intr. to coin an English word by borrowing from another language (rare).

None of these non-standard words is italicized in the reading text so we’d put an @rend="none" attribute on our tag. But what element does the Guidelines suggest for loanwords? <foreign> is clearly out of place, since Stephen is borrowing from non-English languages into English (he applies English verb conjugation to his borrowed verb forms). Cf., in this vein:

Number one swung lourdily her midwife’s bag (U 3.32; my emphasis)

I’m sure this phenomenon is not limited to ‘Proteus’ or to Stephen’s interior monologue. If we start to encounter it all over the corpus, it might be worth marking up.

Jan 24 '17 15:01 yellwork

I really like this idea. Let's do it. The only thing I can think of would be something like <seg type="loanword" subtype="fr">.

This seems related to the neologism/Joyceanism/compound word markup we're doing for Portrait in https://github.com/JonathanReeve/corpus-joyce-portrait-TEI/issues/36. For those, we're using <seg type="neologism">, but I can imagine getting more descriptive, and using <seg type="neologism"> for true Joyceanisms that don't have clear etymologies, and <seg type="compound"> for compound words.

If that sounds good, we can change the title of this issue to cover loanwords, compound words, and Joyceanisms.

Feb 15 '17 23:02 JonathanReeve

What about using something like <distinct>?

<distinct> identifies any word or phrase which is regarded as linguistically distinct, for example as archaic, technical, dialectal, non-preferred, etc., or as forming part of a sublanguage.

We could @type this to include values like "loanword", "non-standard compound word" (or an abbreviation), "archaism", &c. Admittedly, implementing this tagging is likely a while off yet and a task that would require a dedicated crowd of encoders.

Feb 20 '17 11:02 yellwork

<distinct> sounds great. Let's do it. I might know someone who would be interested in helping out with this--the maintainer of the Joyce Word Dictionary. Loanwords, compound words, dialectal words, and related words would be good to track. I agree, though, that this seems low-priority.

Mar 13 '17 21:03 JonathanReeve

This is what I've been thinking: <distinct type="X">, where X is one of:

compound: the word exists in the OED, but only in its hyphenated form (this excludes words whose only OED citations are from Joyce)
nonstandard-compound: a compound word not found in the OED, even in hyphenated form, but composed of two words that are found in the OED
dialect: nonstandard dialect, slang, etc, whether or not it's found in the OED
- for this, we can use @space to distinguish the dialect further, where it has a particular associated place to it (diatope in the TEI docs)
archaism: archaisms generally out-of-use around the turn of the century
- for these, we can use @time to specify the associated time
Joycean: for obvious Joycean coinages, or other distinctive words that don't fit in the above categories, but seem to belong to no other obvious linguistic or lexicographical group, either.
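
Once encoding starts, the list above could be enforced with a small check. What follows is only a minimal sketch, not project tooling: the function name `unknown_distinct_types` and the `SAMPLE` fragment are invented for illustration (the fragment stitches together words quoted in this thread and is not real project markup), and the allowed values are simply the five proposed above.

```python
import xml.etree.ElementTree as ET

# The five @type values proposed in the list above.
ALLOWED_TYPES = {"compound", "nonstandard-compound", "dialect", "archaism", "Joycean"}


def unknown_distinct_types(xml_text):
    """Return (word, type) pairs for <distinct> elements whose @type is not allowed."""
    root = ET.fromstring(xml_text)
    problems = []
    for el in root.iter():
        # Strip any namespace prefix such as {http://www.tei-c.org/ns/1.0}.
        if el.tag.rsplit("}", 1)[-1] != "distinct":
            continue
        kind = el.get("type")
        if kind not in ALLOWED_TYPES:
            problems.append(("".join(el.itertext()), kind))
    return problems


# Invented fragment for illustration only; the classifications are placeholders.
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
He <distinct type="Joycean">welshcombed</distinct> his hair; she
<distinct type="loanword">trascines</distinct> her load, in a
<distinct type="archaism">medley</distinct>.
</p>"""

print(unknown_distinct_types(SAMPLE))
```

Here "loanword" is reported because it is not yet one of the five values; adding it to `ALLOWED_TYPES` would be one way to record a decision to include it.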

Mar 19 '17 15:03 JonathanReeve

Sounds great, Jonathan; I particularly like compound and nonstandard-compound as values. I think this system covers most of our cases. The only other instances that might be worth flagging are (1) Joyce’s extensions of the meaning of a pre-existing word – a subtype of @type=Joycean perhaps? – and (2) his use of an obsolete or archaic sense of an otherwise still-current word (subtype of @type=archaism perhaps?). I’m sure, too, that we’ll encounter combinations of the @type values along the way: Elizabethanisms that survive in Hiberno-English, for example.

(1) Joyce extends ‘welsh comb’ [n. the thumb and four fingers] to include a verb form:

<p><lb n="070331"/>He [Simon Dedalus] took off his silk hat and, blowing out impatiently his bushy
<lb n="070332"/>moustache, welshcombed his hair with raking fingers.</p>

(2) Stephen imagines himself in a ‘medley’ drawing on the noun’s first sense in the OED:

A. n. I. The mixing or mingling of people in combat.

Combat, conflict; fighting, esp. hand-to-hand fighting between two groups of combatants. Also: an instance of this; a war, battle; a tournament; a quarrel. Also fig. Cf. mellay n. 3, mêlée n. 1. Now rare (arch.).

<p><lb n="020314"/>Again: a goal. I am among them, among their battling bodies in a
<lb n="020315"/>medley, the joust of life.

Would it be an idea to bring Natasha into the conversation? Also, if we find we need to disambiguate your list further at some point down the road, well then, so be it.

Mar 20 '17 14:03 yellwork

This is fun as an intellectual exercise – he says, having started the thread – but it’s also more or less academic until we start the business of encoding.

In that vein, have we any way of filtering out all the non-<distinct> words? Running the corpus through a few different spell checkers might reduce the total lexicon down to something more manageable, for example. Or could we cross reference the lexicon with the headwords in P. W. Joyce’s English as we speak it in Ireland in order to single out the Hiberno-English? Has anyone looked at the Oxford Dictionaries API? Wonder is there a way to harness it?
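
A rough sketch of the filtering idea, in Python: tokenize the text and keep only the words that a reference wordlist does not contain. Everything named here is an assumption for illustration. `candidate_words` is a hypothetical helper, and `KNOWN` is a tiny stand-in for a real spell-checker dictionary or headword list, which would have to be loaded from a file.

```python
import re


def candidate_words(text, known_words):
    """Return the sorted, unique lowercase tokens of text that are absent from known_words."""
    # Runs of letters, optionally joined by an apostrophe or hyphen.
    tokens = re.findall(r"[^\W\d_]+(?:[’'-][^\W\d_]+)*", text.lower())
    return sorted(set(tokens) - set(known_words))


# Stand-in wordlist (assumption); a real run would load a full dictionary.
KNOWN = {"she", "trudges", "trains", "drags", "her", "load"}
LINE = "She trudges, schlepps, trains, drags, trascines her load."

print(candidate_words(LINE, KNOWN))  # ['schlepps', 'trascines']
```

Note the limit of this approach: 'trains' passes the filter because it is also an ordinary English word, so anglicized forms that coincide with existing words would still need a human reader.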

What other strategies can we come up with? <distinct> tagging is so potentially massive that I suspect we’d want to figure out ways of getting a sizable amount of it automated in order to produce any kind of credible results or get anywhere near completeness.

Mar 20 '17 14:03 yellwork

Hi Jonathan and Ronan,

This is indeed a fun exercise! Another person worth consulting about how to categorize Joyce's neologisms would be Elizabeth M. Bonapfel. She presented a brilliant paper on the topic on the Joyce panel I organized at this year's MLA. But I think the above ideas are terrific, and bode well for the TEI editions. I am new to coding, and eager to get to work. Going to get set up in the coming days.

Mar 28 '17 12:03 NChenier

Great stuff, Natasha. Once we are all happy with the encoding conventions for <distinct> words in the corpus, we’ll make the rules easily accessible in the project CONTRIBUTING.md file.

I know Elizabeth well and thought of her in the context of Joyce’s word-compounding. I’ll give her a buzz and direct her here!

Mar 29 '17 17:03 yellwork

Also, @NRChenier, I meant to ask: have you found any useful compendia / articles / glossaries of Joyce’s non-standard compound words or neologisms etc? I’m wondering if some of the task of tracking down these instances of <distinct> hasn’t been done for us already (in Joyce crit. over the last sixty years or so). Or have you any ideas how we might isolate the terms? Cheers!

Mar 29 '17 18:03 yellwork

Sounds great @yellwork . And as per useful glossaries: none that I am aware of, besides that put forth by Elizabeth in the paper she presented. It is v helpful! Let's get her in here. You will contact her? I would also be happy to.

And good question re: isolating Joyce's terms. As far as I know, nothing substantial has been done on this front, in the digital realm, as yet. It's a tough thing to do, a) because choosing what sources to cross-reference with presents a number of dilemmas (I can talk your ear off about the problems with the OED, for example, but will spare you for now!!), and b) choosing which of Joyce's words are to be considered neologisms is equally complex. We need to make very thoughtful, deliberate decisions on both fronts before beginning our work.

Andreas Fischer's essay "'Milly Bloom, fairhaired, greenvested, slimsandalled': Joyce's compound adjectives and the OED" in A Collideorscape of Joyce is well worth a read.

Let us know what you think @JonathanReeve .

Apr 03 '17 14:04 NChenier

Hey everyone! I'm going to add some of these tags in Telemachus.

Mar 14 '18 13:03 JonathanReeve
