Wiktionary:Etymology scriptorium/2014/March

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Moved from RFV:

Only Germanic descendants are given, which isn't enough to establish this as a root of PIE origin. It also doesn't really establish it as a root at all, because there is only one descendant word, not several with related meanings. However, the "what links here" list does show that other entries also link to this, so the entry may be valid, but incorrect or incomplete. —CodeCat 16:45, 27 August 2013 (UTC)[reply]

Seems the main Germanic stem is represented by *swerdą (sword) and *sweraną (to hurt; fester), with *swerô (boil, ulcer) being a derivative of the latter. Would Avestan xvara count as a cognate? Leasnam (talk) 19:02, 27 August 2013 (UTC)[reply]
How would rfv even work for an unattested word? Would we have to look for examples of the word not being attested? If so it fails to appear in every text ever, so it's clear widespread use (literally every durably archived piece of writing ever). What I'm really saying is this isn't an RFV matter. Mglovesfun (talk) 20:51, 27 August 2013 (UTC)[reply]
It doesn't have anything to do with attestation but with verification. I am asking others to verify the correctness of this entry. —CodeCat 21:01, 27 August 2013 (UTC)[reply]
You haven't read Wiktionary:Requests for verification/Header, which is a bit frightening if even you haven't read it. Has anyone? Mglovesfun (talk) 21:03, 27 August 2013 (UTC)[reply]
Now that this thread has started here, I guess it can stay here, but in general I'd say the etymology scriptorium is the place to discuss the accuracy of reconstructed forms. —Angr 22:01, 27 August 2013 (UTC)[reply]
I think we've had these at WT:RFDO before. I don't see by what criteria we're supposed to assess validity though. Reconstructed forms really need their own CFI as obviously if they meet WT:CFI#Attestation then they're not reconstructions at all. So, by what criteria? Mglovesfun (talk) 22:17, 27 August 2013 (UTC)[reply]

Moved from RFV:

No reference was given. Can anyone supply one? --Pereru (talk) 15:12, 29 September 2013 (UTC)[reply]

The presence of both the long vowel ē and the change dt > st are clearly attested in the descendants. And there are sources for the infinitive ending -tei too. —CodeCat 16:20, 29 September 2013 (UTC)[reply]
Sure. All we have to do is add sources that make these claims explicitly. Or else, per BP discussion, the page would have to be deleted (at least if I understand well the result of the discussion). --Pereru (talk) 00:13, 30 September 2013 (UTC)[reply]
I'm not sure if it would be possible to find any for the first, as it's just straightforward PIE ē > PBSl ē > Baltic / Slavic ē. Linguistic sources usually focus on changes, not retentions; it's assumed that anything that isn't changed is retained. However, it might be that the ē arose by Winter's law. w:Winter's law mentions both the lengthening and the dissimilation: -edt- > -ēst-. It also mentions the ending (spelled as -tej) but for a source see {{R:Kim PBS}}. The article mentions that Winter's law gave a rising accent, whereas original PIE ē gave a falling accent, so the accent that is actually found in Balto-Slavic would tell us how the ē arose. —CodeCat 00:31, 30 September 2013 (UTC)[reply]
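The derivation sketched in the comment above (lengthening by Winter's law, then the dissimilation dt > st) can be written as two ordered rewrite rules. This is only a toy illustration: the function names, the ASCII input spelling `edtei`, and the simplified conditioning (any short vowel directly before b, d or g that is not followed by h) are assumptions made for the example, not a full statement of Winter's law or of any published reconstruction.

```python
import re

# Toy mapping from short to long vowels (assumed notation for this sketch).
LONG = {"a": "ā", "e": "ē", "i": "ī", "o": "ō", "u": "ū"}

def winters_law(form: str) -> str:
    """Lengthen a short vowel standing directly before a plain voiced stop.

    Simplification: 'bh', 'dh', 'gh' are treated as aspirates and do not
    trigger lengthening; no other conditions are modelled.
    """
    return re.sub(r"([aeiou])(?=[bdg](?!h))", lambda m: LONG[m.group(1)], form)

def dental_dissimilation(form: str) -> str:
    """Rewrite the cluster dt as st."""
    return form.replace("dt", "st")

def derive(form: str) -> list[str]:
    """Apply the two rules in order and return every intermediate stage."""
    steps = [form]
    for rule in (winters_law, dental_dissimilation):
        form = rule(form)
        steps.append(form)
    return steps

print(" > ".join(derive("edtei")))
```

The rule order matters in this sketch: if the dissimilation ran first, the d would already be s and the lengthening rule would find nothing to act on.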
If someone wants to play at reconstruction, here's my translation from Vasmer: Russian "ем" (1st person singular present of "есть" (to eat)), Ukrainian ïм, ḯсти, Belarusian ем, есць, Old Church Slavonic мь, сти, Bulgarian ям, Serbo-Croatian jе̑м, jе̏сти, Slovene jė́m, jė́sti, Czech jím, jísti, Slovak jem, jest', Polish jem, jeść, Upper Sorbian, Lower Sorbian jěm, jěsć.
ORIGIN: Proto-Slavic *ědmь, infinitive *ěsti, Middle Lithuanian ė́du, ė́mi, ė́dmi 1, ė́džiau, ė́sti "eat (of animals, cattle)", Latvian ę̄du, êst, Old Prussian īst "to eat", Old Indic ádmi, átti "to eat", Armenian utem "I eat" (from Indo-European *ōd-; see Bartolome(? spelling), IF 3, 15), Greek ἔδω, ἔσθίω, ἔσθω, Latin edō, ēdī, Gothic itan, past tense at, 1st person plural ētum, Old High German еʒʒan "to eat". --Anatoli (обсудить/вклад) 01:05, 30 September 2013 (UTC)[reply]
Long root in Latin is by Lachmann's law, in Balto-Slavic by Winter's law, further confirmed by acute accent in Slavic and Lithuanian. For Hittite, Kloekhorst dismisses the Narten theory altogether because attested forms show ablaut (zero-grade) in Old Hittite, and only in newer (Middle Hittite) texts being replaced by full-grade forms. I.e. the original PIE paradigm being a normal root-present with *e/Ø- ablaut *h₁édti / *h₁dénti. --Ivan Štambuk (talk) 01:38, 30 September 2013 (UTC)[reply]
  • There is also the issue of Balto-Slavic infinitive ending, because in this case we also have Old Prussian cognate īstwei which can't really derive from Balto-Slavic *ēstei as was argued here (the Old Prussian infinitive ending -twei is problematic). Perhaps Balto-Slavic verbs should be simply formatted as roots, i.e. *ēd- (or *ḗd-). --Ivan Štambuk (talk) 02:10, 30 September 2013 (UTC)[reply]
    • On that same page I argued that while we can't be sure that -tei was used as the infinitive or even as an infinitive, we can be sure that it existed in Proto-Balto-Slavic and was used in some verbal function. So I don't see any reason to leave it out. When we cite descendants we cite whole paradigms, not only the matching lemma forms. We list Bulgarian first-person singular forms as descendants of Proto-Slavic infinitives. We list Romance infinitives as descendants of Latin first-person singular present active indicative forms. We list Afrikaans infinitives as descending from Dutch infinitives even though they descend from the first-person singular present. And we list descendants of PIE terms in the nominative case's root grade (*méntis) even when the descendant has the genitive's grade (Balto-Slavic *mintis, Germanic *mundiz). So there really is plenty of precedent for lemmas that are not exact phonological descendants. —CodeCat 12:56, 1 October 2013 (UTC)[reply]
      We can't be sure that *-tei/*-tej/*-tey was used as an/the infinitive, and you see no reason to leave it out? I don't quite comprehend your logic. If there is no agreed lemma form for a protolanguage, we should be using the most neutrally cited form, i.e. the present stem. --Ivan Štambuk (talk) 14:02, 1 October 2013 (UTC)[reply]

Moved from RFV:

I tagged this page because it had no references. If someone has a good reference, please add it. Or else, the page would eventually have to be deleted. --Pereru (talk) 23:33, 13 October 2013 (UTC)[reply]

I've added Max Vasmer. Will that do? The Latvian and Lithuanian examples could be expanded and glossed, if that's a concern. --Anatoli (обсудить/вклад) 23:53, 13 October 2013 (UTC)[reply]
But there is no mention of *źambas in that entry in Vasmer's dictionary. In fact, there is no mention of a Proto-Balto-Slavic etymology there, as far as I could see. Isn't there a source that actually mentions *źambas as Proto-Balto-Slavic? --Pereru (talk) 13:43, 14 October 2013 (UTC)[reply]
I've provided references now. —CodeCat 13:50, 14 October 2013 (UTC)[reply]
But your references are for the supposedly involved sound changes, not the reconstruction itself. --Ivan Štambuk (talk) 14:12, 14 October 2013 (UTC)[reply]
  • I've now moved the page to the form that can be referenced. However, the new spelling reflects a competing reconstruction of Proto-Balto-Slavic (that advocated by Frederik Kortlandt), whereas the original spelling (*źambas) is advocated by some others. I'm not sure how to reconcile the two (and possibly other differing reconstructions as well) within a single page name. NPOV requires us not to give precedence to either theory. --Ivan Štambuk (talk) 14:27, 14 October 2013 (UTC)[reply]
    • I've moved the page back as it was premature and there was no consensus for a move. I think you are being way too forceful with Balto-Slavic and Slavic reconstructions in imposing your own view of things. The very thing that you accuse me of doing, it seems. The whole idea of our earlier proposal regarding research was to avoid situations like this, but apparently you just want to go ahead and push through your own ideas? —CodeCat 14:39, 14 October 2013 (UTC)[reply]
      It's perfectly OK to remove any objectionable content until evidence is presented supporting it. The burden of evidence is upon you, not me. This is not a matter of achieving consensus, since this is not a content or formatting dispute; it's a simple dispute of facts. Are there any references for *źambas or not?
      You also restored your inflection template for Proto-Balto-Slavic nouns, which is entirely made up. There are half a dozen different proposals by researchers for almost any of these desinences. You create them as if you're creating a conlang - combining bits and pieces from various theories, which cannot work. It's OR and must go. --Ivan Štambuk (talk) 14:54, 14 October 2013 (UTC)[reply]
        • My disagreement is over whether we need to find a reference that gives exactly the form *źambas. You assume that we do, but I do not agree with that assumption. You also assume that all OR must go, blatantly disregarding the prior discussion which established that, at the very least, there is no consensus on OR. So to act as though such a consensus exists is POV pushing. —CodeCat 15:22, 14 October 2013 (UTC)[reply]
          • [1] - You've again removed the citation-needed tag (this time {{rfv-etymology}}, which specifically targeted the *źambas form), assuming that a link to a paper describing sound changes within one particular researcher's framework would suffice. The request was and still is, to provide a citation to the reconstruction *źambas. How it is derived is immaterial.
          • The discussion regarding OR on Wiktionary was left as inconclusive due to insufficient user input. The only one pushing for OR in protolanguages as far as I can see is you. If the content is controversial, and there is no consensus on it, it must not be entered in articles. That it was inserted in articles before it was disputed is irrelevant. --Ivan Štambuk (talk) 15:55, 14 October 2013 (UTC)[reply]
            • I still dispute your claim that a source for the reconstruction *źambas is necessary to maintain the page. You still haven't given any reason why it's needed, you just keep requesting it and dismissing the sources I have given. If you don't agree with the sources then fine, but you cannot invoke OR as a valid reason for dismissing the references unless you can demonstrate clearly that there is a consensus that it is a valid reason. And so far the discussion hasn't led to any clear consensus. So this request has been satisfied as there is nothing indicating it hasn't been, except for your own personal refusal. —CodeCat 16:14, 14 October 2013 (UTC)[reply]
              It's needed because apparently you made it up. What part of that is unclear to you? There are Balto-Slavists that reconstruct it differently, and whose cited reconstructions you moved to your made-up reconstruction.
              You keep invoking the consensus clause but it doesn't matter, because it's not up to us to reach a consensus whether the reconstruction can be cited or not - it's a fact we can merely establish or refute. Whether the reconstruction itself is valid within a particular framework of reconstructing Proto-Balto-Slavic is irrelevant. --Ivan Štambuk (talk) 17:06, 14 October 2013 (UTC)[reply]
              Of course it's up to us. We are all part of the Wiktionary community and therefore we are the ones forming a consensus on what is right for Wiktionary or not. That the reconstruction can be cited verbatim or not is something we can establish or refute, yes. But what is not given is that any reconstruction that is not cited verbatim is therefore irrelevant. I don't see where you got that from and as far as I can tell you just made that up. So I am challenging it in order to form a consensus on what should be done. This RFV is not yet closed until a consensus exists and as long as it's only the two of us disagreeing, no consensus will ever be formed so this RFV will have no value. Yet you apparently think that the matter is already decided - a consensus that has not in fact been formed - and start moving and deleting pages. That's bad form, just like it would be if a page were deleted while its RFD discussion was still in full swing. —CodeCat 20:27, 14 October 2013 (UTC)[reply]
              Made-up reconstructions are the same category as made-up words. No references = original research, it cannot stay. You have twice moved an attestable reconstruction, that is cited from a work written by a scholar in the field, to the one that is disputed. I'm hardly the only one questioning your actions (Note that it wasn't me who started this discussion). Now your argument for moving back is that the discussion is still ongoing. If you fail to deliver a citation, would you agree that the article should be deleted? --Ivan Štambuk (talk) 23:40, 14 October 2013 (UTC)[reply]
              No, because reconstructions by definition can't be cited in use. The criteria are different, but as was evident from the prior discussion, there is no consensus on what the criteria are. A reference to *źombos is ok for the page *źombos, but you keep moving the page *źambas there without prior discussion, and in fact while a dispute about it is ongoing. We don't delete pages until the RFV fails, period, and this RFV is still ongoing because there is no agreement on what criteria are needed to satisfy it. If you want to create *źombos then do so, but do NOT move or delete *źambas until the dispute is sorted. —CodeCat 02:23, 15 October 2013 (UTC)[reply]
              The criteria are the same: citations in usage for attested words is the same as citations without usage for reconstructions. But you want to take it further - by claiming that no reconstruction can be attested in usage, which cannot be done due to their nature (them being reconstructed), every reconstruction is equally valid, regardless whether it comes from an established scholar or User:CodeCat. Why don't you try putting your made up reconstructions on Wikipedia? Whoops, it would get reverted because it's OR. So instead you come here and contaminate the project with your inventions. (Of which I have nothing against to be clear - I would've reconstructed PBSl. *źambas as well, it's the principle that matters, and the "extra info" that you add such as dubious paradigms). No we won't delete the page, but I'll delink it from any of the mainspace entries given its OR nature. You can keep insisting on sorting the dispute and reaching a consensus. --Ivan Štambuk (talk) 03:22, 15 October 2013 (UTC)[reply]
              So you're saying consensus is irrelevant? Then why are you even editing Wiktionary? Is it for your own personal principles or those that everyone can agree on? —CodeCat 03:29, 15 October 2013 (UTC)[reply]
              But you don't really seek consensus. You just push your own way without caring for alternatives. Old Church Slavonic зѫбъ (zǫbŭ) had back in 2007 etymology added that gave *źombos for Proto-Balto Slavic [2]. Which you changed in 2013 [3], in an edit marked as minor. Now that *źambas has been challenged you claim that it represents a former consensus, and that it should stay unless a consensus is reached to delete/move it. IMHO you're just abusing the procedure to push your uncitable reconstructions everywhere. Yes it's sad that we don't have policies laid out yet to forbid such activities - but there was no need to do so because there was no abuse at this scale until now. --Ivan Štambuk (talk) 03:40, 15 October 2013 (UTC)[reply]

Delete, unless a reference is provided for this headword. OR in reconstructions should be forbidden. How do I know I can trust User:CodeCat's competence? --Vahag (talk) 07:00, 15 October 2013 (UTC)[reply]

I provided a reference for all the sound changes? Linguistics is not a black box you know, you're meant to understand it to be able to use it. If I were an outsider, I'd find the etymology I gave far more convincing than the one at *źombos, which really tells me nothing except "person X said that this is the right form". The etymology I gave actually accounts for everything, so that's a better and far more linguistically comprehensive reference. —CodeCat 13:20, 15 October 2013 (UTC)[reply]
Since this is an unattested term, it cannot be solved through an RFV. This cannot be cited per WT:CFI, if it can it needs to be moved into the main namespace. Mglovesfun (talk) 16:07, 15 October 2013 (UTC)[reply]
Thank you!! —CodeCat 16:13, 15 October 2013 (UTC)[reply]
But, Mglovesfun, where should the removal or non-removal of Appendix pages be discussed? --Pereru (talk) 01:08, 16 October 2013 (UTC)[reply]
I think we usually deal with Appendix pages at WT:RFDO, but it might be better to deal with these at the Etymology Scriptorium like we do with contested etymologies- they are, after all, sort of an extension to the etymology sections. Either would be better than rfv. Chuck Entz (talk) 06:32, 16 October 2013 (UTC)[reply]
I find it ironic that you cite Derksen when he ultimately reconstructs it as *źombos not *źambas. If you really don't see anything problematic with that, then I'm afraid there is not much left to discuss. --Ivan Štambuk (talk) 02:15, 16 October 2013 (UTC)[reply]

I see there still is disagreement on when to delete or not to delete pages on reconstructed protoforms, OR vs. no OR, etc. As far as specific reconstructions like *źambas vs. *źombos go, here is a suggestion: why not allow both pages to remain, one with the references to where the form is actually proposed (*źombos, by Derksen), and the other, *źambas, with a tag indicating that it is "a Wiktionary proposed form" (placing it in, say, Category:Protoforms proposed by Wiktionarians or something like that) and with the supporting arguments (like the series of sound changes with references that CodeCat provided), either in the page itself, or elsewhere, with a link to them. If this is done, it will be easy to see which reconstructions were proposed by practicing Indo-Europeanists and which were not, so those who don't want to read original Wiktionary research can avoid it. Does that seem OK to y'all? (Of course, the question remains whether one should cite *źombos or *źambas, or both, in the etymology section of specific words, like Latvian zobs). --Pereru (talk) 01:08, 16 October 2013 (UTC)[reply]

The problem is that the sound changes involved are cited by two different authors (Matasović and Derksen) who endorse completely different reconstruction of Proto-Balto-Slavic. User:CodeCat is cherry picking pieces of different theories, forming a new one that is supported by neither of the cited researchers, including the reconstruction of desinences in the inflection template. To me that appears as misguiding the reader. OR should not be cited like that to make it artificially appear as legitimate. There should be an appendix page "User:CodeCat's reconstruction of Proto-Balto-Slavic" where all of the details of their preferred form of protolanguage are outlined, with all of their reconstructions conforming to them. It makes no sense to individually discuss sound changes involved in particular etymologies like that. It comes as a package, not as individual reconstructions. --Ivan Štambuk (talk) 02:08, 16 October 2013 (UTC)[reply]
I'm not a specialist, so I cannot opine... But if indeed she is cherry-picking between two theories without distinguishing which is which, then I would agree with you that it does misguide the reader. But then -- would you agree that there is a need for a discussion (perhaps a vote?) on the criteria for eliminating and/or keeping pages with specific reconstructed protoforms? Shouldn't a specific proposal be made, and voted on (like the proposal I made in our last BP discussion, or some variant of it)? The way things are right now, it's difficult to even see whether this RfV will lead to any result... Shouldn't the people here who care about etymologies gather and talk this through? --Pereru (talk) 02:43, 16 October 2013 (UTC)[reply]
Certainly there is a need. My preferred solution would be to keep the two completely separated - "official" etymologies where up-to-date scholarship is compiled, and a separate set of articles where Wiktionarians can play Saussure to their heart's content. (And which I have absolutely no desire browsing, let alone editing.) The latter should have a big banner describing their purpose, and shouldn't be linked from any entries in the main namespace due to their uncertain and amateurish nature. But User:CodeCat seems to be keen on not only integrating the two, but dismissing existing scholarship which they found "wrong", such as the recent replacements of *źombos with *źambas. This "my way or the highway" attitude is troubling me. --Ivan Štambuk (talk) 03:35, 16 October 2013 (UTC)[reply]
I can agree with your suggestion -- I don't have such a low idea of Wikipedians (or perhaps I don't have such a high idea of Academics, in my experience with them), but I'm perfectly OK with keeping published scholarship and Wiktionary OR apart. Now, is there a way to make it official? Put it up to a vote somewhere? I mean, in the absence of clear guidelines, it is difficult to do something officially about replacing published scholarship with one's own views. (In Wikipedia, that would violate NPOV, for instance, as all would agree.) Without official guidelines/policy, we run the risk of an edit war, say, between you and CodeCat on the ultimate fate of *źombos vs. *źambas. --Pereru (talk) 04:14, 16 October 2013 (UTC)[reply]
Well until such "official policy" is developed I think that the most important thing is to make a clear distinction between the OR content by Wiktionarians, and citable etymologies supported by scholarship. Otherwise etymologies will start to look like our Old Prussian entries, where several POV-pushers have over the years created hundreds of articles combining both the actually attested Old Prussian (comprised of a few thousand words) with the Neo-Prussian conlang developed within some failed revival effort. That was of course all done solely to promote their fake language. Keeping them apart will enable either the completely separate treatment in the future, or some kind of unified approach with clear tagging what is OR and what is made up, whatever the community decides. --Ivan Štambuk (talk) 20:45, 16 October 2013 (UTC)[reply]
You've been fixating so much on "No Original Research" that you've forgotten another Wikipedia axiom: Assume Good Faith. It certainly looks to me like CodeCat's motivation has been strictly a desire to improve our coverage of Balto-Slavic etymology. You may disagree with her methods and you may consider some of her choices deeply flawed or just plain wrong, but she's not the POV vandal you're making her out to be. You have strong opinions about how things should be done, which is good, but be careful about assuming that they're so fundamental and self-evident that opposition to them has to be from ulterior motives.
I agree with the idea of marking Wiktionary-generated reconstructions and linking to an explanatory page- but can we make it a bit less like a wanted poster for Public Enemy Number One? We should have a small superscript attached inline to the reconstructed form in the etymology that says something neutral like noref and has a link to a page explaining in a matter-of-fact fashion how such reconstructions are different from those backed up by references. The appendix page itself can have a more prominent warning- though it's hard to compete with that masterpiece of overkill that we already have at the top of every entry for a reconstructed form. Chuck Entz (talk) 06:32, 16 October 2013 (UTC)[reply]
Yes WT:RFDO. I agree it's not ideal. Also if for this sort of thing we don't allow any original research, we are limited to just copying things out of reference books with no input of our own. Seems very un-Wikimedian to me. Mglovesfun (talk) 11:31, 16 October 2013 (UTC)[reply]
I don't think that their actions reflect good faith attitude. More like "I know everything, you're wrong, I don't care whom you cite." Personal attacks, lying, repeatedly misrepresenting sources, replacing cited reconstructions with that of their own invention, trivializing the purpose of discussions due to lack of policy...I don't see any good faith in that, only dirty tricks to promote their POV, i.e. a particular form of Balto-Slavic protolanguage, others be damned. But it's no problem, I can play that game too.
I see no point in giving the same amount of credibility to reconstructions fabricated by Wiktionarians and those devised and refined by established scholars. That noref tag would make it seem like there is a valid citation missing, not "this is OR". Highly speculative work needs to be conspicuously tagged and separated. --Ivan Štambuk (talk) 20:05, 16 October 2013 (UTC)[reply]

 

*źombos vs. *źambas again

I see some activity at *źambas, including a new tag marking it as original research. With this tag, would it be OK now to remove the RfV template? Do you guys agree that marking it as OR is enough?

There is also the question of who is going to draft a page outlining Wiktionarian OR with respect to etymologies. Any candidates? CodeCat? Ivan?

Also, if you don't mind, I'll be starting a vote shortly about whether or not reconstructed form pages should always have references. --Pereru (talk) 23:24, 18 October 2013 (UTC)[reply]

(Please notify me on the talkpage in the future, I've stumbled upon this randomly). Yes I agree that it should be removed from RfV (which apparently can't conclude), but OR banner must stay to make distinction between OR by Wiktionarians and established scholarship.
Nobody in particular has to draft a page, it should be a collective effort. I have several ideas, and have already mentioned tentative guidelines for making reconstructions in the last month's BP discussion (but not on how they should be marked, and whether such reconstructions should be mentioned in etymologies). --Ivan Štambuk (talk) 22:55, 19 October 2013 (UTC)[reply]
What good does the banner do if it doesn't even explain what OR is or why it's important? It's a Wikipedia concept and not a Wiktionary one. Our users, and even a lot of our editors I imagine, probably don't really understand what the meaning or significance of the template is, or why it's meaningful to place it on an entry. As many others have pointed out before, every entry has some OR in it, so the meaning of and need for the template has to be established before it can be used effectively. Otherwise it's nothing more than a single user's pet project / decoration / clutter. —CodeCat 23:00, 19 October 2013 (UTC)[reply]
But since reconstructed forms aren't words stricto sensu, the difference is an important one, i.e. a source external to Wiktionary vs. Wiktionary itself. I think the banner can be made clearer on what OR is here (i.e., a reconstruction proposed by a Wiktionarian).--Pereru (talk) 23:49, 19 October 2013 (UTC)[reply]
Inferring definitions from attestation is not creating new knowledge. Postulating reconstructions on the other hand is. Postulating reconstructions in the domain of normal lexicography would be equivalent to creating new words or their meanings, something usually done on the WT:LOP where many of the new words are proposed on the basis of combining established constituent morphemes. Furthermore, reconstructions are by definition equivalent to making statements 1) the listed reflexes are genetically related 2) their last common ancestor had such phonemic inventory 3) the specific reflexes changed in such-and-such way (which is very often irregular). There is lots of "hidden info" within any individual reconstruction.
If the editors or users don't understand the notion of original research, perhaps a guideline page is needed to describe it to them. I'd be happy to write one if necessary. I'm pretty sure they're even more ignorant of the message displayed by the {{reconstructed}} template, of which we haven't had any complaints to my knowledge.
I can imagine that you're feeling disgruntled by such banner - but the problem is that many of your reconstructions (as well as proto-language inflections and definitions) are speculative in character, and I think it's pretty important to separate what is generally established in the literature (i.e. the communis opinio of etymologists) from what is Wiktionarians' guesswork. Once again, I have nothing against particular reconstructions, and I think that *źambas is perfectly plausible, but it's the principle that matters, otherwise if no distinction is made there could be no way to separate internal and external etymologies which would make the project's etymologies completely useless. I have spent lot of time compiling etymologies from various sources, I'd be very sad if that were for nothing. --Ivan Štambuk (talk) 00:48, 20 October 2013 (UTC)[reply]

Moved from RFV:

Can we get a reference for this reconstruction (perhaps Kim)? Or else, I suppose it will have to go. --Pereru (talk) 09:54, 7 November 2013 (UTC)[reply]

If it helps, Frederik Kortlandt says in Baltica & Balto-Slavica: My first example is the reconstruction of the Proto-Baltic demonstrative pronoun on the basis of the Old Prussian evidence. Van Wijk has argued that Prussian stas is a contamination of *sa and *tas (1918a: 111). In Proto-Indo-European, the pronominal stem *to- was in complementary distribution with the nom.sg. forms masc. *so, fem. *. As in Lithuanian, the oblique cases gave rise to competing nom.sg. forms masc. *tas, fem. *. [] Turning around the evidence, we can conclude from the creation of Old Prussian stas that the suppletive nom.sg. forms *so and * had been preserved in early Prussian and, consequently, in the Balto-Slavic proto-language. The contamination of *sa and *tas cannot have been very recent because initial st- spread to the adverbial forms stwi, stwen, stwendau "(from) there". On the other hand, the creation of *tas was probably posterior to the disintegration of the Balto-Slavic proto-language.
He says something to the same effect in Demonstrative Pronouns in Balto-Slavic, Armenian, and Tocharian. - -sche (discuss) 17:14, 7 November 2013 (UTC)[reply]
Hmm... he doesn't label the forms *tas', * as either Proto-Baltic or Proto-Balto-Slavic. Judging by most of Kortlandt's research, Proto-Balto-Slavic is probably what he meant, but he didn't say so. To me, this would be at best very iffy as a source. Isn't there a better one? I mean, the person who started *tas (CodeCat) read about it somewhere, right? Where was it? --Pereru (talk) 13:14, 9 November 2013 (UTC)[reply]
  • Delete per Kortlandt. This Proto-Balto-Slavic which "conveniently" ignores Old Prussian (the most archaic branch) reminds me of PIE reconstructions which "conveniently" ignore Anatolian languages (also the most archaic branch). --Ivan Štambuk (talk) 13:28, 9 November 2013 (UTC)[reply]

I think the general structure of this name is clear. It derives from two Germanic elements, the second of which is *wīgą. But I can't quite figure out the first. It seems to be *hlūdaz (loud), but at the same time there are also many forms that have -o-, which would point to either *hluda- or *hlōda-, neither of which I recognise. It does seem that regardless of the origin, the name has escaped at least some of the sound laws in the various languages. Dutch -k is irregular (found also in other names with this element), and German -d- points to earlier -þ-, because an original -d- would give -t- in German. But I don't know if there are any sources old enough to show -þ- if it was present. German -u- and Dutch -o- can only correspond to an original short -u- in Germanic, but that just raises the question why the German form doesn't also have -o- (like in many other words) because that's what the Latin shows too. Dutch u > o only happened in the 12th century, so in the 7 centuries between the time of Clovis and that sound change, both languages must have had -u-, which makes the Latin form even stranger. That is, unless u > o was part of the borrowing. An original long -ō- might work, but then the expected form in Old High German would have to have -uo-, and I don't know if it does. So it's a bit of a puzzle. —CodeCat 21:20, 2 March 2014 (UTC)[reply]

For whatever it's worth, the Ludwigslied mentions "HLUDUICO REGE FILIO HLUDUICI" in Latin and "hluduig" in Rhine Franconian Old High German. - -sche (discuss) 21:55, 2 March 2014 (UTC)[reply]
That rules out -uo- then. It could be -ū-, but then you'd expect it to diphthongise in modern German, giving *Laudwig (or if -d- is original, *Lautwig). -ū- would also not be borrowed into Latin as -o-, presumably. —CodeCat 22:12, 2 March 2014 (UTC)[reply]
Ludwig could be a reborrowing from Latin Ludovicus, though; that would explain the lack of diphthongization in the second syllable as well (a thoroughly native form would be expected to be *Lautweig). If *hluda- is real, it could come from the PIE zero-grade *ḱlutós, but that's otherwise unattested in Germanic. —Aɴɢʀ (talk) 20:59, 3 March 2014 (UTC)[reply]
Names are different though... they often contain elements that are not found anywhere else in a language. I wouldn't be surprised if name-giving follows its own kind of vocabulary, which may be very archaic. In any case, I think that there may have been a lot of cross-borrowing among these names. The Dutch -k is probably borrowed from Romance/Latin. The final -wig of German doesn't have to be borrowed, because its short vowel could predate the diphthongisation. After all, the common element -ric was shortened too, in Dutch as well. —CodeCat 21:15, 3 March 2014 (UTC)[reply]
More old attestations of the name: the Straßburger Eide (from 842) have "Lodhuuic[us]" in Latin, and "Lodhuuig" in Old French. - -sche (discuss) 20:50, 4 March 2014 (UTC)[reply]
There's also Ludhuwīge in the Old High German part. I think it's interesting that the same name is written with -o- in Latin/French and -u- in OHG. I also notice that the initial h- is gone by this time. The -dh- also shows up in bruodher, scadhen, werdhēn, all of which had -þ- in Germanic and still have -d- in German. So it probably stands for [ð]. It also appears in several of the Old French words, though, so I wonder if French had [ð] at the time too? —CodeCat 21:11, 4 March 2014 (UTC)[reply]
You say the h was gone "by this time", but the h-having Ludwigslied (881) was written after the h-less Straßburger Eide (842). - -sche (discuss) 21:24, 4 March 2014 (UTC)[reply]
But was it also written in the same place? —CodeCat 21:49, 4 March 2014 (UTC)[reply]

r and rj in PSl.

I was just adding some Proto-Slavic etymologies, including *ǫgorь, and was wondering how certain we can be that it wasn't *ǫgorjь. Do any of its reflexes prove it has to have ended in -rь and not -rjь? —Aɴɢʀ (talk) 21:02, 3 March 2014 (UTC)[reply]

I don't know about Serbo-Croatian, but Slovene does preserve -rj- medially (morje) so if there is a Slovene descendant, you'd expect ogor as the nominative, but either ogora or ogorja as the genitive. —CodeCat 21:18, 3 March 2014 (UTC)[reply]
SP gives the latter: link. So that settles it in favour of -rj-. —CodeCat 21:23, 3 March 2014 (UTC)[reply]

Where did the υ come from? --WikiTiki89 20:26, 4 March 2014 (UTC)[reply]

This word has two long ī's, and the regular development of that would be i in Romance. But some of the descendants have e, and French has something different altogether. Does anyone know what might have caused this? Was the first ī shortened at some point? —CodeCat 01:40, 9 March 2014 (UTC)[reply]

As for French, isn't "oi" the regular outcome for short "i"? I believe veisin is attested for Old French, before it became voisin. Chuck Entz (talk) 02:52, 9 March 2014 (UTC)[reply]
I know that it's the regular outcome of ẹ (from ē and i) in stressed syllables. I don't know about unstressed ones, because in those syllables close and open mid-vowels merged in early Romance. At least, I'm not aware of any language that shows any kind of distinction between them. —CodeCat 03:20, 9 March 2014 (UTC)[reply]

bombus as Latin term for bumblebee

An Italian IP a couple of hours ago added (diff) a translation to bumblebee of Latin bombus. It's true that the scientific name of the genus is Bombus, but according to all the Latin references I have handy, Latin bombus refers to the buzzing- not the bee. The scientific name was first published by w:Pierre André Latreille in 1802, but I'm not sure where he got it from.

My question: did Latin bombus ever refer to bumblebees (I'm not sure the scientific name counts, since it's translingual)? Italian seems to be one of only a couple of Romance languages that have a descendant as the term for bumblebees. Is there any evidence that it had that meaning in Vulgar Latin or Medieval Latin? Chuck Entz (talk) 02:34, 9 March 2014 (UTC)[reply]

Not in A Glossary of Later Latin to 600 A.D. (Souter). DCDuring TALK 03:42, 9 March 2014 (UTC)[reply]

Finnish -lainen and cognates

The etymology that's currently in the article does seem fine at first glance. But the word laji itself is a relatively recent loanword. Not just because the etymology says so, but because it contains ji which wasn't actually permitted in Proto-Finnic; it was simplified to plain i (seen in the inflection of veli, and also the etymology of voi and many others). The suffix is also widespread in all Finnic languages, from Võro in the south to Veps in the east to Finnish in the northwest. So what I don't get is how a derived form of a word that was supposedly borrowed from Old Swedish - implicitly not older than the late middle ages - managed to become so widely used even in places where there has never been Swedish rule or any other close cultural contact with Finland. It hardly seems plausible that all Finnic-language speakers could have collectively decided in the last 500 years to add -inen to laji and then start forming hundreds of words with that combination. —CodeCat 04:14, 9 March 2014 (UTC)[reply]

Could it be that (at least in some of the languages) there are actually several homographic, partially synonymous suffixes with different origins, like the different -ers and -lys in English? References agree that Finnish -lainen derives from laji, but could some or all instances of Estonian -lane/-line, Livonian -li, etc have a different, j-less origin?
Alternatively, ... nature abhors a vacuum, and languages abhor lexical gaps (and like new words even when they don't fill lexical gaps). Once Finnish developed a useful suffix -lainen, it doesn't seem that implausible to me that other Finnic languages could have taken it up. It would be interesting, but probably difficult, to find data on the earliest uses of the suffix in various languages.
Johanna Laakso's Throwing a Glance at heittää contains this passage: "In many cases of grammaticalization, semantic abstraction or generalization of meaning in Finnic, the lexeme in question is a loanword. Good examples are the Finnish postposition kanssa ‘with, in the company of’, with cognates including the Estonian comitative case suffix -ga, which goes back to the Germanic loanword kansa ‘people, company / companion’, and the Finnish suffix -lainen ‘of an X kind, like X’, from laji ‘sort, type, species’ (from Swedish slag; cf. Laitinen & Lehtinen 1997)[. ...] The question arises whether loanwords are more prone to be grammaticalized — perhaps they [...] lend themselves more easily to novel uses?"
- -sche (discuss) 05:00, 12 March 2014 (UTC)[reply]
(following up on my point about earliest uses of the suffix in various languages) I did not make an exhaustive study, but in the first four places in Matthew where the 1938 Finnish Bible has -lainen, the 1548 Bible has other constructions, suggesting that -lainen was not as common then as now:
2:2
  1938: Missä on se äsken syntynyt juutalaisten kuningas?
  1548: Cussa ombi se esken syndynyt Judain Kuningas?
2:23
  1938: Hän on kutsuttava Nasaretilaiseksi
  1548: Henen pite Nazareus cutzuttaman
4:24
  1938: Ja maine hänestä levisi koko Syyriaan, ja hänen luoksensa tuotiin kaikki sairastavaiset, monenlaisten tautien ja vaivojen rasittamat, riivatut,
  1548: Ja he toijt henen tygens caikinaijset saijrat, moninaisista taudheista ia kiwuista kiewretudh, ia ne piruldariuatut,
5:13
  1938: mutta jos suola käy mauttomaksi, millä se saadaan suolaiseksi?
  1548: Jos nyt sola tule maguttomaxi, mille se solatan?
- -sche (discuss) 05:37, 12 March 2014 (UTC)[reply]
That is interesting. Judain is an archaic genitive plural, the second line just uses the loanword Nazareus directly, the third seems to use moninainen instead of monenlainen, and the fourth replaces the adjective suolainen with the verb suolata. This isn't direct proof that the suffix wasn't in use at the time, but it does seem that there were plenty of ways around it. (The old text has a few other archaisms that I like too... ombi > on shows the 3sg ending -bi in its original form, oo > uo hadn't happened yet, and maguttomaxi still shows the weak-grade -g- that vanished later.)
Estonian -line is not a cognate, its Finnish cognate is -llinen, and maybe the same applies to the Livonian. What puzzles me most though is that even Veps has a cognate, even though Veps is relatively isolated from the remaining Finnic languages (see map) and was never part of Sweden as far as I know. The only cognate I can find on this short word list is the endonym vepsläine, which could easily have been an exonym originally, borrowed from another Finnic language. But Veps Wikipedia (yes, it exists!) has a few more, so maybe it has become productive there too. It's still a bit of a puzzle. Could combined Swedish-Finnish influence really have been so strong that it caused this single suffix to spread so far and become so productive in just a few centuries? —CodeCat 14:15, 12 March 2014 (UTC)[reply]
I've managed to find some more early uses.
The Finnish Bible of 1548 uses -lainen after all; Mark 7.26 has a Grekilainen, Luke 4.27 has a Syrialainen, and Luke 17.18 and 24.18 have Mucalainen and mwcalainen.
The fragmentary Estonian Bible of 1632 says in Matt 22.44 "istuta hendas minnu parramball käghel, senni minna panne sinnu wainlasset, üttes penckis sinnu jalla alla" ("sit at my right hand until I put your enemies under your feet"). The Bible of 1686 has, in Hebrews 1.13, "Istu minno hähle Käele / senni kui minna panne sinno Wainlaisi sinno Jalgu Allutzesz?". The same verse in the Bible of 1739 reads "istu minno parrema pole, kunni ma saan pannud so waenlased sinno jalge alluseks järjeks".
A German work on Võro from 1864 mentions the words mustlane (?) and wenelāne ("Russian"), and a work from 1870 mentions the Votic word venālainē ("Russian").
- -sche (discuss) 04:21, 13 March 2014 (UTC)[reply]
As sche guesses above, there are indeed two homonymous suffixes involved here (cf. e.g. Lauri Hakulinen, Suomen kielen rakenne ja kehitys). "Ethnonymic" -lainen is from the toponymic -la + -inen, and according to Hakulinen, probably was originally extracted from words like karjalainen (where -la is a part of the stem). This suffix abides by vowel harmony, may have fossilized semantics, derives nouns, and is added to the bare stem; whereas similative -lainen does neither, derives adjectives, and is affixed to the genitive, e.g. käsky (order) > käsky-läinen (underling) vs. (nonce form) käsky-n-lainen (order-like).
Words suffixed with "old" -lAinen include, in addition to ethnonyms (from SKRK):
apulainen, eläkeläinen, huutolainen, jälkeläinen, jättiläinen, kansalainen, karkulainen, kaupunkilainen, kerholainen, kerjäläinen, kotolainen, koululainen, (seura/palo)kuntalainen, kyyhkyläinen, käskyläinen, käypäläinen, köyhäläinen, liittolainen, maalainen, matkalainen, mehiläinen, metsäläinen, paholainen, raakalainen, rautatieläinen, saarelainen, sukulainen, syöpäläinen, tehtaalainen, tuholainen, työläinen, vainolainen, viholainen, yöläinen.
In summary: on our currently listed entry for -lainen, senses 1-3 belong under one etymology, while senses 4-5 belong under another. --Tropylium (talk) 03:51, 21 March 2014 (UTC)[reply]
…and on closer look it appears that the "lumpenetymology" was added by me a couple of years ago. My bad, guys. >_> --Tropylium (talk)
@Tropylium Thank you for explaining that. I've adjusted the entry accordingly. Could you see if it's all ok, and add references if you have any? Also, what do you know about the n-s alternation in words like these? —CodeCat 13:50, 21 March 2014 (UTC)[reply]
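The contrast between the two suffixes described above can be made concrete with a small Python sketch. This is only an illustration of the rule as stated in the discussion, not a description of Finnish morphology in general: the function names are made up, the harmony check is deliberately simplified (it looks for a back vowel anywhere in the stem, so it would mishandle compounds such as rautatieläinen), and the genitive form has to be supplied by the caller.

```python
# Simplified assumption: a stem takes the back-vowel variant if it contains
# any of a, o, u; otherwise it takes the front-vowel variant.
BACK_VOWELS = set("aou")


def ethnonymic(stem: str) -> str:
    """Older noun-forming suffix: added to the bare stem, follows vowel harmony."""
    has_back_vowel = any(ch in BACK_VOWELS for ch in stem.lower())
    return stem + ("lainen" if has_back_vowel else "läinen")


def similative(genitive: str) -> str:
    """Adjective-forming suffix: added to the genitive, always -lainen."""
    return genitive + "lainen"


if __name__ == "__main__":
    print(ethnonymic("käsky"))   # bare stem, front-vowel variant
    print(similative("käskyn"))  # genitive, no harmony
    print(ethnonymic("apu"))     # bare stem, back-vowel variant
```

With käsky as input, the first function yields the harmonizing noun and the second the non-harmonizing nonce adjective, which is the pair cited in the thread.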

RFV of the etymology. Tagged but not listed. More specifically, the Proto-Indo-European portion of the Proto-Germanic section:

At the very least, the notation would seem to be incompatible with the way we normally do Proto-Indo-European, and I'm not sure about the spider part of it, either. The same information is also to be found at *rukkô, and perhaps elsewhere. Chuck Entz (talk) 21:19, 9 March 2014 (UTC)[reply]

All the redlinks, the absence of important gloss information for the terms in the Etymology, and the absence of any (possibly obsolete) senses in the English definitions that closely relate to the few glosses that are in the Etymology make it impossible to accept the plausibility, let alone the likelihood, of the etymology given. This is the kind of etymology that should be hidden until it is ready for prime time. DCDuring TALK 23:07, 9 March 2014 (UTC)[reply]
The PIE form should be *rukn-/*rukk- or *rəkn- with the meaning of "weaving, web" according to several sources. Leasnam (talk) 00:40, 21 March 2014 (UTC)[reply]
PIE didn't have geminated obstruents, as far as I know. For example *h₁es-si was simplified to *h₁esi. And *ə is not a PIE phoneme either, so it needs investigating. Furthermore, roots generally have an e-grade, so we would also need to figure out what it is. —CodeCat 00:43, 21 March 2014 (UTC)[reply]
I don't think PIE roots had obstruents before sonorants either, so an *-n- after the *-k- would have to belong to a suffix rather than the root. But ἀράχνη (arákhnē) probably isn't of Indo-European origin anyway. Watkins lists a Celtic and Germanic root *ruk- but doesn't take it back to PIE. —Aɴɢʀ (talk) 14:10, 21 March 2014 (UTC)[reply]

We currently have an undefined sense at give (sense 14) - namely, the "give" in "what gives?"

I've heard suggestions previously that, for example, "how's it going" comes from a literal translation of the German "wie geht's". Could "what gives?" be the same - that is to say, a calque of the German "was gibt's?" (or a Yiddish equivalent, possibly)? If it is, then it's idiomatic, and we don't have to claim that "gives" means anything special outside of this set phrase. If it's not, then I can't see any clear logic behind this phrase, or what the "gives" is doing. Smurrayinchester (talk) 13:17, 21 March 2014 (UTC)[reply]

That seems quite plausible to me. I heard the expression a great deal from my German-born father and not much from others. [[what gives]] could use some citations, BTW. DCDuring TALK 09:20, 27 March 2014 (UTC)[reply]
If "what gives" is the only place the above-mentioned sense of "give" is used, I would delete it even if "what gives" were not a calque: words can be used in unusual ways in idiomatic constructions, and that's covered in the idiom entries, not in entries for each portion of the idiom. (We do not and should not have a sense at [[town]] to explain what it means in the idiom [[go to town]].) Borrowing from German + Yiddish is a popular theory. Stack Exchange has some data on the phrase. The American Heritage Dictionary of Idioms, Second Edition (2013) says "what gives? may derive from the German equivalent, Was gibt's? Slang from about 1940, it is also used to mean 'how are you'[.]" The earliest use is said to be in John O'Hara's 1940 novel Pal Joey. - -sche (discuss) 09:50, 27 March 2014 (UTC)[reply]
Here is a discussion. DCDuring TALK 10:16, 27 March 2014 (UTC)[reply]
I included that in the entry, but now I wonder if it's so fringe that it should be removed. He considers the possibility that this use of give lurked in English, unattested, since proto-Germanic times, but excludes calquing from German as unlikely because English wouldn't have borrowed from German in the years prior to this term's attestation. He ignores that it could have been calqued from German and then lurked unattested for just a few decades. He also states that Yiddish lacks a "what gives"/"was gibt's" construction; @Wikitiki89, Metaknowledge do you happen to know if that's true or not? - -sche (discuss) 19:01, 28 March 2014 (UTC)[reply]
1. Pinging only works if you sign your name with --~~~~ in the same edit, so I was not pinged. 2. I don't know enough Yiddish to confirm or deny that statement, and I doubt that User:Metaknowledge does either (but hopefully my ping will get his attention in case he does). --WikiTiki89 19:11, 28 March 2014 (UTC)[reply]

How did Proto-Balto-Slavic *degtei become Proto-Slavic *žeťi? Shouldn't it have become ×deťi? *žeťi ought to come from a *gegtei instead. —Aɴɢʀ (talk) 15:58, 22 March 2014 (UTC)[reply]

I have no idea. Derksen's dictionary of Proto-Slavic says: "Most probably from *dʰegʷʰ- > *geg- as a result of assimilation.". —CodeCat 16:27, 22 March 2014 (UTC)[reply]

RFV of the etymology because it contradicts the Online Etymology Dictionary. --kc_kennylau (talk) 00:53, 25 March 2014 (UTC)[reply]

For what it's worth, which may be little, WP says this in w:Substituent#Nomenclature:
"The suffix -yl is used in organic chemistry to form names of radicals, either separate or chemically bonded parts of molecules. It can be traced back to the old name of methanol, "methylene" (coined from Greek words methy = "wine" and hȳlē = "wood"), which became shortened to "methyl" in compound names. Several reforms of chemical nomenclature eventually generalized the use of the suffix to other organic substituents."
- -sche (discuss) 01:45, 25 March 2014 (UTC)[reply]

I'm curious about the etymology of Arabic بَرْنَامِجٌ (barnāmij, program, show). It does not seem like a native Arabic word, since it has 5 root letters which is very rare for native words. --WikiTiki89 23:30, 25 March 2014 (UTC)[reply]

I've got a new grammar book, which says something about the etymology of this word. I will check it later, if I can find that page again. --Anatoli (обсудить/вклад) 23:53, 25 March 2014 (UTC)[reply]
It is from Persian برنامه (bar-nâmeh) (a sheet of paper used to tally up numbers), from بر (bar) + نامه (nâmeh), from Pahlavi Persian نامگ (nāmag). The Pahlavi گ produced the Arabic ج. The plural of برنامج is برامج (barāmij), the ن being dropped because broken plurals do not permit so many consonants. Related words: مبرمج (programmer), برمجة (programming). —Stephen (Talk) 00:08, 26 March 2014 (UTC)[reply]
Thanks, Stephen! I've made a simple entry, please someone add etymology and fix the entry otherwise. --Anatoli (обсудить/вклад) 00:19, 26 March 2014 (UTC)[reply]
Just to add to that, -k' (-ag) is a Middle Persian suffix which became -a and later -e (ـه) in New Persian, and it is used in many words. All Arabic nouns ending in -aj that I know are from Middle Persian. I made some changes to برنامج and برنامه. --Z 10:20, 26 March 2014 (UTC)[reply]
Counterexample: حِجَجٌ (ḥijaj). But I'm sure there is some way to rephrase that to make it true. --WikiTiki89 10:39, 26 March 2014 (UTC)[reply]
Yes, better to say, "strange" words (possible loanwords) which end in -aj (this one has ح (ḥ) so it's Arabic or Semitic at least), and also -aq which I forgot to mention: the Middle Persian suffix was pronounced as -ak in the early form of the language, examples are خندق (xandaq), بيدق (baydaq) and باذق (baadhaq). --Z 11:05, 26 March 2014 (UTC)[reply]

hazelnut: φουντούκι, fındık,

Various sources cite the Greek as derived from the Turkish, others the Turkish derived from the Greek. What is the true etymology? Is it mere coincidence that "fındık" seems a straightforward transliteration of Arabic "funduq" (inn, han, caravanserai)? --Larrynstout

The Turkish word is from Ancient Greek; sources often simply call it "Greek". The full etymology of Modern Greek φουντούκι is: Ottoman Turkish فندق (funduq) < Persian فندق (fondoq /funduq/) < Arabic فندق (funduq) [< Persian فندک (fondok)] < Middle Persian pndk' (pondik) < Ancient Greek ποντικόν κάρυον (pontikón káruon). --Z 12:03, 28 March 2014 (UTC)[reply]
And from which language is Russian фунду́к (fundúk) borrowed? Our entry says from Turkish, but that doesn't make much sense to me in terms of phonetics (if it came from Turkish fındık, it should have been *фындык (fyndyk)). --WikiTiki89 18:30, 28 March 2014 (UTC)[reply]
OTOH, consider Баку (Baku) from Azeri Bakı. —Aɴɢʀ (talk) 19:01, 28 March 2014 (UTC)[reply]
According to Vasmer and Shansky, Russian фундук (funduk) is from Crimean Tatar funduk. I only find Crimean Tatar fındıq, but I don't have good resources on this language. Note also Turkish dialectal (Trabzon, Rize) funduk. As for Баку (Baku), bear in mind that until recently the city was mainly populated by civilized urban peoples—Persians, Armenians and Jews. Turkic-speakers appeared later. So the Russian may reflect Persian باکو (Bāku) or Armenian Բաքու (Bakʻu). --Vahag (talk) 19:30, 28 March 2014 (UTC)[reply]
As ZxxZxxZ already said, the ultimate etymon of φουντούκι and fındık is Ancient Greek Ποντικόν κάρυον (Pontikón káruon, hazelnut). The homonymous Arabic فندق (funduq, inn, hotel) is ultimately from Ancient Greek πανδοκεῖον (pandokeîon, inn). Both words have spread through the Orient. --Vahag (talk) 22:27, 28 March 2014 (UTC)[reply]