Wiktionary talk:Votes/2021-09/New standard for archived quotations


A plea to those on the fence

My impression from the discussions on this page and elsewhere is that there's plenty of dissatisfaction with the current policy, at least as written, but a lot of people also feel my proposal isn't quite what's needed. Ultimately no consensus emerged about how to change the proposal text, so I haven't edited it. My request is, if you agree that this proposal is at least better than what's currently on WT:ATTEST, then please vote for it. If nothing else, updating the policy will get people's attention and create momentum for further change. Organizational policy has a way of being stubborn. On the other hand, if you think the proposal would on net make things worse, then you should vote against it, of course, but consider making your own proposal for another attempt at fixing this situation. Many of you have a better sense of what the Wiktionary community wants than I do. —Kodiologist (t) 17:19, 29 September 2021 (UTC)Reply

Instead of breaking the system out of frustration and relying on the supermajority rule to ensure it never gets fixed, withdraw the vote and propose one that addresses only the subject of internet slang which is what got you started. That would be an addition to the CFI stating
A sense with the label {{lb|en|Internet slang}} may be kept if [some standard that is stricter than "three people have used it on the internet" but less strict than "three professionally edited books or print journals have used it"].
Vox Sciurorum (talk) 17:30, 29 September 2021 (UTC)
I'm not advocating for anybody to break any systems here. I don't know what "standard that is stricter than 'three people have used it on the internet'" you have in mind (like I said, there's a lot of interest in some stricter standard but little agreement of what it would be), and this isn't really only an issue for Internet slang. —Kodiologist (t) 11:56, 30 September 2021 (UTC)Reply

Title

@Kodiologist, this vote has an extremely misleading title. Rather than clarifying policy, it would radically change it. —Μετάknowledgediscuss/deeds 21:42, 18 September 2021 (UTC)Reply

What would you like to title it? —Kodiologist (talk) 23:06, 18 September 2021 (UTC)
@Kodiologist: That's up to you; I'm unlikely to support this vote regardless. If you want a suggestion, how about "New standard for archived quotations"? —Μετάknowledgediscuss/deeds 00:38, 19 September 2021 (UTC)Reply
No problem. Done. —Kodiologist (talk) 01:17, 19 September 2021 (UTC)
@Metaknowledge: Look, I know I am retarded, but there is a convolute (aye, we lack a noun entry) of problems found in the last decade or half, and in so much desires have been tallied to radical solutions. In other words I esteem that from any angle it is desired and desirable for the current CFI verbiage to be swept away for a new one that is more conscious of where there is elbow room and where not. Therefore like him I have a radical proposal and with less illusion about its radicality I found a voting longer than four weeks apposite to either vote.
Mayhaps my proposal is even so genius that @Geographyinitiative—whose opinion of either formulation I am curious to hear!—will not vex us anymore in requests of verification. I have not aimed at toponyms because of their nature being peculiar to that of common nouns, so there is room for specific votes about personal and place names.
For example, my formulation still asks for “use in permanently recorded media in at least three independent instances spanning at least a year is apposite” but with the restriction “unless the varietal nature of the lexical item suggests that the item surface differently” which he could interpret to mean for him that the second and third quote might be a secret military map he cannot find but just assume. (Our CFI are practically a bizarre amassment of memes which can only be explained by links to discussions. Therefore we have talk about things like secret military maps which must estrange the outsider, not only because some editors are dotty. The hyperbole is reality.) Fay Freak (talk) 02:56, 19 September 2021 (UTC)Reply
I will follow the rules, as written, as implied, as actually implemented, fair/unfair, reasonable/unreasonable, etc. I did not intend to vex others, but instead to hone my own awareness of the truth of what is needed to lock down my section of Wiktionary so it is a goldmine for the readers rather than a mudhole. --Geographyinitiative (talk) 03:01, 19 September 2021 (UTC)Reply
@Geographyinitiative: But unfair and unreasonable rules are a problem, sweetie. How far must we have gone that rules are followed without reason? Rules have a telos to which their application must be directed to bring about their effet utile.
(As sidenote, English law is bad and avoided because of their surprising literalist interpretations of contracts, and it is absurd that fairness or Treu und Glauben must be specifically added to international contracts so their application is not fiat iustitia et pereat mundus. In my country, unlike the US or PR China, laws are not by default unfair.) Fay Freak (talk) 03:42, 19 September 2021 (UTC)Reply
It is not to me to make the rules, it is to me to use the rules to make mainspace edits. --Geographyinitiative (talk) 03:49, 19 September 2021 (UTC)Reply
@Geographyinitiative: But now we are making them and you have been asked, and you will have the power of voting. Is there no judgment? No rules that can disconcert you? Fay Freak (talk) 03:56, 19 September 2021 (UTC)Reply
I don't want to be rude to you and I thank you for your comments on the discussion about archive standards, etc. I just have to really stay out of policy discussions generally. I just think the subject matter of what that vote is trying to change is too sacrosanct for me to interfere. I don't want to go down historically in Wiktionary as watering down the standards. Whatever happens is okay and I will roll with it. I like Internet Archive, I like books, I like unarchivable webpages, I will just go with whatever the site is doing. It is all fun. --Geographyinitiative (talk) 03:28, 20 September 2021 (UTC)Reply
@Geographyinitiative: I don’t think what happens here is okay and it is not true that whatever happens is okay. People are gaming the system by applying the supposedly “sacrosanct” rules, including you by asking the question of “attestation” too loudly, which is not an actual scientific question that should be pursued, because it is only pursued with inadequate resources, but at this point a private language chimera devoid of providing meaning. I hate these left-field “verification” discussions and most people bar stirrers hate them so we have to formulate rules in a way discussion is not needed or at least not hated. The “rules” derived from WT:ATTEST are not sacrosanct, they are a continued history of abominations, and surely not rules either—for as I said only derivations, the formulations having been too rough to actually convince anyone. You could say it is already repealed by custom, but replaced only by some intersecting understanding of the mob. Fay Freak (talk) 13:36, 22 September 2021 (UTC)Reply

Too Ambitious

I think this vote is pretty good, aside from some awkward/unnecessary wording ("Internet sources are not required to be formally published, because informal language is within the remit of Wiktionary." is unnecessary and "A Wiktionary editor must not cite their own utterances." might be better put more simply: "Wiktionary editors must not cite themselves."). However, I doubt this vote will pass, since it's trying to do too much, without any real way of ensuring quality control. I would recommend simplifying the vote so that it allows reputable sites with material published online (like news sites, online encyclopedias, etc.). If that passes, we can start thinking about allowing social media posts and the like. Andrew Sheedy (talk) 04:10, 19 September 2021 (UTC)

Anyway, why shouldn’t one cite oneself if one has published a book? If another editor can cite me, then I can cite myself too. Fay Freak (talk) 04:12, 19 September 2021 (UTC)
But informal language is within the remit of Wiktionary. If we restricted ourselves to "reputable sites" with "quality control" or other "reliable sources", we wouldn't be able to capture informal language in its natural habitat. This is why Wiktionary has always allowed Usenet posts and books of any sort, because Usenet randos and trashy bargain-bin romance novels are no less valid as examples of the use of a living language (like English) than The New York Times. To be sure, you should be wary about claims of fact made in these sources, which is why Wikipedia has a reliable-source policy for citations to support claims of fact. But these sources are perfectly legitimate as examples of usage, because they exemplify how language is actually used. (I hope you agree that these points show that the sentence "Internet sources are not required to be formally published, because informal language is within the remit of Wiktionary." is necessary.) —Kodiologist (t) 12:39, 19 September 2021 (UTC)
In case it's helpful, here's a specific example where I think it's important to cite an informal source. We happen to know the exact utterance in which the slang phrase "on fleek" was coined: the sentence "Eyebrows on fleek, the fuck." in a 2014 Vine video. The video is currently mentioned under "Etymology", but getting the earliest possible citation for a word or sense is important. We should cite that video and it should count towards attestation. —Kodiologist (t) 01:14, 20 September 2021 (UTC)
It's statements like this that make me suspect that you don't understand what Wiktionary's quotations are even for. We can quote anything, and if you want to add that Vine video as a quote in the entry, go ahead! The distinction is that such a quote could not currently be used to count toward keeping the entry at RFV. —Μετάknowledgediscuss/deeds 17:18, 20 September 2021 (UTC)Reply
On the contrary, I'm saying that such a quote should "be used to count toward keeping the entry at RFV". That's what I mean by "it should count towards attestation". Is that not what Wiktionary's quotations are for? —Kodiologist (t) 17:36, 20 September 2021 (UTC)Reply
You said that it was important because it was early, not because it provided attestation. Illustrating usage and attestation are the two distinct purposes of quotations, and you have conflated them. —Μετάknowledgediscuss/deeds 20:02, 20 September 2021 (UTC)Reply
I suppose earliness has some value for illustrating usage, but I think it's more important for attestation, because earlier citations attest to the sense in question existing earlier. —Kodiologist (t) 20:24, 20 September 2021 (UTC)Reply
"Wiktionary editors must not cite themselves" sounds better, but I was afraid it might be taken to forbid e.g. a Wiktionary editor quoting a chapter of a scholarly book he edited in which each chapter is by a different author. In this case, the editor is quoting his own work but not his own utterance. Let me know if you think the wording "Wiktionary editors must not cite themselves" clearly allows this sort of thing, or if, contrariwise, you don't think it should be allowed. —Kodiologist (t) 12:39, 19 September 2021 (UTC)Reply
One comment I would make about Usenet is that it is nowadays a pretty obscure corner of the Internet that fewer and fewer people know about or know how to search. Now we are talking about potentially anyone typing anything into Google search and being able to cite whatever comes up, which is a rather different ball game, it seems to me. Mihia (talk) 20:49, 19 September 2021 (UTC)Reply
Again, Kodiologist, I'm not saying your proposals are bad, just that you're trying to make too many changes at once. Let's just expand our citation horizons a little bit, to a point that everyone can agree on, and worry about expanding them further in a future vote. I'm all for allowing a greater variety of Internet sources, but people have such a variety of opinions on this subject that you're not going to get much support. If you introduce 1 change, you have a chance of passing it. If you introduce 5, your chances are much lower, because different people will oppose for different reasons and fewer people will support all the changes. Andrew Sheedy (talk) 16:19, 24 September 2021 (UTC)Reply
@Andrew Sheedy: I don't understand. There's only one key change here, which is to make it explicit that web pages are allowed. The rest of the proposal is just wording adjustments and a rule about not quoting yourself, which I've added specifically to support using web pages; I hope you agree that the proposal has less, not more, of a chance of passing without it. Like I've said elsewhere on this page, you and others have voiced a desire for rules about "quality control" or "reliable sources" or a "manhole cover", but there is little agreement about what form that should take, particularly in a way that would be consistent with our anything-goes policy for Usenet and books, and with our goal to document slang usage. So, I don't know what you would expect me to propose along those lines. Am I misinterpreting you? —Kodiologist (t) 17:59, 24 September 2021 (UTC)Reply
Picking up on your "making it explicit" point, I wonder if I might step back slightly on a tangent, and ask something that I meant to raise earlier (or raise again, most probably, and apologies if I have forgotten the answer). As far as I can understand it, the letter of the current text at WT:ATTEST does not exclude attestation from websites, provided that they are "permanently recorded". If we are now saying that Internet Archive is "permanent enough", or as permanent as Usenet anyway, then what is actually preventing us from citing archived websites right now? Yet the very much dominant or prevailing opinion seems to be that presently we can't cite any old website, and once when I asked whether we could, the answer was "Noooooo!", as I recall. So where does this idea stem from, and where is it actually stated? Mihia (talk) 21:21, 24 September 2021 (UTC)Reply
Good questions. As a Wiktionary noob, I have no idea. Notice that I originally titled this proposal "Clarify archiving policy for attestation", but at least one person thinks my proposal is a "radical change" and not just a clarification of what was already insinuated by the text. Wiktionary:Searchable external archives, which is notably linked to from Wiktionary:Requests for verification/English, says "Websites are not considered durably archived", but it isn't an official policy page. —Kodiologist (t) 01:56, 25 September 2021 (UTC)Reply
The community consensus has been pretty clear, at least since I started editing here 7 or 8 years ago, that "durably archived" does not apply to anything on the Internet except Usenet. I don't agree that it should be limited to that, and I think anything on the Internet Archive should be fair game, but many people disagree with that and think that only a select number of cites should be permitted, at least for now. All this being said, I intend to support your vote, as I think it's high time we loosen our silly rules a bit and recognize that the Internet is a vast lexicographical minefield that we would be foolish to ignore entirely. Andrew Sheedy (talk) 20:18, 25 September 2021 (UTC)Reply
Great, thanks for your support. —Kodiologist (t) 13:07, 26 September 2021 (UTC)

Which websites should be regarded as reliable sources for quotations?

In principle I support the idea that a reliable source that is archived by a reliable website such as the Internet Archive can be cited. However, I think that the proposal doesn’t adequately deal with a major issue, which is what sort of websites should be regarded as reliable sources for quotations in the first place. Anyone can submit a website, including one that they’ve created themselves, to the Internet Archive for archiving. What is to stop someone from creating a blog filled with neologisms, archiving it at the Internet Archive, and then claiming it can be quoted as evidence of use of the terms? — SGconlaw (talk) 07:10, 19 September 2021 (UTC)Reply

This. Which is the reason that in my draft and discussions of all the issue—as well as some other Wiktionarians did—I emphasized the collective mass of websites, the fact that an “internet word” should be included if it occurs with a sufficient perspiration across various sites (in a way that seems organic rather than a channer campaign or a ghost word caught on from Wikipedia inventing it in the first place), so that it does not matter if it is this or that individual website because for every deleted webpage there will be another one using the word. Fay Freak (talk) 08:38, 19 September 2021 (UTC)Reply
I’m not sure I’d go that far, though. Taking a more conservative approach, I’d prefer to exclude personal blogs and wikis, and include only well-established online-only content (for example, BuzzFeed, HuffPost, Issuu, Medium and TechCrunch) as reliable sources. Perhaps we need to establish a Reliable Sources discussion page like Wikipedia has. — SGconlaw (talk) 09:48, 19 September 2021 (UTC)Reply
@Sgconlaw: The transferability of the reliability criterion of Wikipedia is surely chimerical. That is for content reliability, ours is editing standards so low that we can say “yes, this term is written in serious like that”. And yes, the Daily Stormer or some random Islamist blog is a reliable source. Because when I write it on Wiktionary this word has a different meaning than when I write it on Wikipedia. Apart from that—that the editors of Wiktionary decidedly don’t care that the online sources are “well-established” or “respectable” or “reputable” or the like, since they are decided to have the offensive terms, and printed books can also be of bad repute—you know we don’t have the manpower for such discussion pages (which already reek of political screening, for which these editors here aren’t reliable judges …). Fay Freak (talk) 10:17, 19 September 2021 (UTC)
See above for why I don't think we should restrict to reliable sources (and why Wiktionary doesn't already, allowing e.g. random Usenet posts). "What is to stop someone from creating a blog filled with neologisms, archiving it at the Internet Archive, and then claiming it can be quoted as evidence of use of the terms?" — The last sentence of my proposal. If you can get other people to use your neologism in a fashion that meets all the criteria of WT:ATTEST (e.g., in "three independent instances spanning at least a year"), then I think that's totally legit and you can add it: your neologism has caught on enough to merit inclusion. —Kodiologist (t) 12:41, 19 September 2021 (UTC)Reply
  • I agree with SGconlaw above. Although I very much support allowing citations from "sensible" Internet content, the currently proposed wording seems to give complete carte blanche to any old rubbish, bad English or made-up nonsense that anyone can find (or create) somewhere on the Internet. I think this aspect needs to be tightened up -- certainly before I could vote support. On a couple of general wording issues:
I would delete the sentence "Internet sources are not required to be formally published, because informal language is within the remit of Wiktionary" as I don't see it as necessary.
I would mention print media first, and then Internet sources. Yes, I know that the order of the proposed new wording somewhat just reflects the existing wording at WT:ATTEST, but the existing wording, where Usenet Groups are given such prominence even over books, is simply bizarre, and really ought to be changed whatever the fate of this vote.
I have a pet hate of using "their" as is done in the last line of the proposed new text. May I advocate "A Wiktionary editor must not cite his or her own utterances." However, I cannot imagine how this rule would be enforced. How could we tell?
Mihia (talk) 19:57, 19 September 2021 (UTC)
See above re: "sensible" sources and informal language. Mentioning Internet sources first seems appropriate to me since it's generally easier to find uses of a word on the Internet than in print. Singular "they" is shorter than "he or she" and has the added bonus of including nonbinary people. (Both of these latter two points are minor and I'd be happy to change them if I got your vote by so doing.) I readily admit that people may be able to successfully pass off their own work as someone else's, but it's still worth checking for and preventing as well as we can, the same way Wikipedia enforces w:WP:AUTOBIO, or else we're inviting self-promotion. —Kodiologist (t) 23:43, 19 September 2021 (UTC)Reply
I agree that we should include slang and informal English, but somewhere between this and the crap of e.g. Urban Dictionary a line should be drawn. I do not believe that the present wording achieves this. Mihia (talk) 23:50, 19 September 2021 (UTC)Reply
In that case, I don't know what kind of line you would want to draw. Like, I agree that we don't want to become Urban Dictionary, but we prevent that in part by requiring citations to real examples of use, in contrast to Urban Dictionary, where you can just assert that a word has a given meaning, no citations required. (In fact, making up your own words and definitions is an intended use of Urban Dictionary; it's not intended to only describe language that already exists, like the OED and Wiktionary.) —Kodiologist (t) 01:03, 20 September 2021 (UTC)Reply
TBH, I don't really know how to exactly define the line that I would like to draw. I wish I did, and then I would be able to propose something concretely. I just have a kind of horror, as I alluded to above, of people being able to type any old rubbish into Google search, have three instances come up, and then be able to add it to Wiktionary. Mihia (talk) 08:38, 20 September 2021 (UTC)Reply
If three instances come up and taken together they meet WT:CFI (amended as proposed) and the word and sense otherwise meet all other present inclusion requirements, I see no rational basis to omit the word. Documenting a living language requires overcoming any sense of discomfort or disgust we might have about some of its speakers and their speech. —Kodiologist (t) 11:43, 20 September 2021 (UTC)Reply
I may be just repeating myself here, but IMO:
  • We need a way to ensure that words have achieved "sufficiently broad" currency, and are not made up and used by only a very small group of people, such as a group of friends all talking to each other. I'm not certain how we should measure breadth of currency. I don't think it is sufficient to count hits, since it may be the same people or person every time. Perhaps we should require multiple different websites or even types of website, or types of content anyway, e.g. requiring some examples from "editorial" content, and not all from forum posts, say.
  • We need a way to prevent inclusion of words or meanings that are just bad English (misunderstandings, misspellings, non-native rubbish, juvenile rubbish, etc.), notwithstanding that COMMON errors can be listed as such. Just to give a random example, I can readily find numerous instances of "noidea", whereas IMO clearly this should not be included. While we do have the "rare misspellings should be excluded" rule, I can't bear the thought of repeated arguments about whether the likes of "noidea" should be included because someone can produce three instances -- or even thirty instances. While this problem potentially already exists, it could easily become multiplied if we accept any old content from the Internet. If we could somehow require at least some citations from sufficiently "reliable" sources, we could more easily put paid to a lot of the crap.
Mihia (talk) 17:49, 23 September 2021 (UTC)
(1) This should be covered by the preexisting "independence" rule. If two (or three, or a hundred) uses are all from the same handful of people, they're not independent. (2) Hmm, I see what you mean, but I'm not sure how much of a problem it is. The entries themselves would be minimal (e.g., "misspelling of 'no idea'"). I would see it as kind of a waste of effort to add and cite a bunch of uninteresting misspellings to Wiktionary, but it should be harmless. If there was real time and effort put into this, it would actually become valuable, as a data source for a spellchecker with unusually high-quality human-selected suggestions. —Kodiologist (t) 20:16, 23 September 2021 (UTC)
I'm not sure that the existing "independence" rule would cover it? It says "Roughly speaking, we generally consider two uses of a term to be 'independent' if they are in different sentences by different people", but three different people who are all part of a tiny in-group would not demonstrate broad currency. On the other front, while one opinion may be that it is harmless to have entries such as "noidea", I personally disagree that we should distinguish this kind of rubbish in any way (except possibly in an auto-generated "Did you mean?" tip). And I think there is a lot more rubbish where that came from, i.e. the Internet. Mihia (talk) 21:05, 23 September 2021 (UTC)Reply
Now that I'm looking again at the relevant section of WT:CFI, I'm afraid you're right about independence. I think two uses shouldn't be considered independent if they're from the same very small clique, even if the writers aren't the same person, because a dictionary is supposed to document languages and dialects rather than the habits of a linguistic community as small as one family or one bridge club. However, adjusting that part of WT:CFI should probably be its own proposal, instead of folded into this one. Do you agree? —Kodiologist (t) 21:34, 23 September 2021 (UTC)Reply
@Mihia: The number of three only made sense in so far as editing standards, of the type of sources allowed, limited what could reach that number. There is no five no ten and no thirty that would satisfy. That’s why in my formulation this number is abandoned, and it is left to judge whether an entry is made out of rubbish or there is more behind it, because that’s what you want to have the option to do, right. Fay Freak (talk) 15:19, 20 September 2021 (UTC)Reply
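
The attestation test discussed in this thread (three independent uses "spanning at least a year", none written by the editor who cites them, each one durably archived) can be sketched roughly as follows. This is only an illustration under stated assumptions, not a statement of policy: the `Quotation` record, the `is_attested` function, and their parameters are hypothetical names, and treating "different author" as a stand-in for "independent" is exactly the simplification questioned above, since it cannot detect a small clique of distinct writers.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Quotation:
    """Hypothetical record of one quotation offered as evidence of use."""
    author: str     # who wrote the quoted text
    written: date   # when the quoted text was written
    archived: bool  # whether a durable archive copy exists


def is_attested(quotes, citing_editor, min_uses=3, min_span_days=365):
    """Return True if the quotations meet the sketched attestation test.

    Assumption: two uses count as independent when their authors differ.
    """
    # Discard quotations that are not archived or that the citing
    # editor wrote themselves.
    usable = [q for q in quotes
              if q.archived and q.author != citing_editor]

    # There must be enough distinct authors among the usable quotations.
    authors = {q.author for q in usable}
    if len(authors) < min_uses:
        return False

    # Some pair of quotations by different authors must lie at least
    # min_span_days apart; any further author completes the set.
    return any(
        a.author != b.author
        and (b.written - a.written).days >= min_span_days
        for a in usable
        for b in usable
    )
```

For example, three archived quotations by three different people dated January 2019, June 2019, and March 2020 would pass, while the same set would fail if the citing editor had written one of them.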

I have long argued in favour of a total overhaul of the policy on durably-archived sources. But I'll be voting against this proposal as it currently stands due to the issues raised by SGconlaw. There needs to be clear and firm guidance on what constitutes a reliable source. An anything-goes approach would be an absolute disaster. It would make it easy for anyone to game protologisms or jokes into mainspace, or to fill entries with quotations that serve no purpose but self-promotion or the dissemination of fringe material. "We should quote Nazi websites" is not an argument that merits any degree of consideration or discussion. WordyAndNerdy (talk) 13:41, 22 September 2021 (UTC)Reply

(1) I don't see how this proposal would be easy to game, considering the other CFI requirements that would still be here after the change. You would need to get your neologism used in three independent cases other than yourself in a span covering at least a year, at which point, it seems to me your word has legitimately caught on enough to merit inclusion; congratulations. (2) It's quite possible right now to fill entries with quotations to one's own works or to fringe material (remember that you can always include extra quotes that don't count for attestation), and the only change this proposal makes to that is to ban quoting yourself, so if you're against that, that's a reason to be in favor of the proposal. The status quo is already anything-goes so long as the utterance is posted on Usenet instead of Twitter. (3) I don't relish quoting Nazis any more than you do, but if a word is used only by Nazis, better to document its use than to leave the word uncited. We have no policy excluding quotes on the basis of taste or the moral depravity of the speaker, and with good reason: it would open a huge can of worms. Remember that inclusion of a word or quote in a dictionary is not an endorsement of that word or quote. —Kodiologist (t) 14:18, 22 September 2021 (UTC)
I'm sorry if the above comment came across as unduly harsh. Your proposal is a solid attempt at modernizing CFI in isolation. It's just that there are other factors that work against it. Look at my edit history. I don't shy away from documenting unpleasant terminology using CFI-compliant sources. I just believe a bright line needs to be drawn to prevent the use of Wiktionary for the dissemination of nonsense that wouldn't pass muster as a reliable source on Wikipedia. This is an issue that came up last year. And it's non-negotiable if Wiktionary wants to maintain its reliability and accessibility both to readers and contributors. I've previously suggested allowing Twitter to be used in place of Usenet, since it is easily searchable, has a large and active userbase, and isn't likely to disappear any time soon. WordyAndNerdy (talk) 15:41, 22 September 2021 (UTC)Reply
I think the tone of your comment was fine; it's no problem. It seems there is agreement that Wiktionary should continue to include slang, and, to that end, cite some kinds of utterances that would not pass muster as reliable sources for claims of fact over on Wikipedia. At the same time, you (and several others) feel that my rule, of allowing any web page that's snapshotted on the Internet Archive, would be too liberal. This leaves me clueless both where to draw the line and how such a line would be motivated. Like, allowing Twitter in addition to Usenet would be a big improvement over Usenet alone, I agree. But why would you allow Twitter but not Reddit, Facebook, Instagram, the xkcd forum, or personal websites? Twitter, like Usenet, has no real editorial control or other quality control. Tweets can be deleted or hidden arbitrarily, so we'll need to use the Internet Archive or some other archive no less than for web pages. —Kodiologist (t) 17:26, 22 September 2021 (UTC)Reply
Content on Twitter is openly accessible in contrast to Facebook and Instagram. The Library of Congress archived every public tweet from 2010 to 2017. I think that gives Twitter an edge over other social media platforms. I'm not opposed to allowing Facebook, Reddit, or Tumblr content to be cited in principle. The preclusion of Tumblr and Reddit as citation sources makes it particularly difficult to attest newer fandom slang. (There's a five-year lag, I'd say. I'm only now starting to be able to reliably attest MCU and Sherlock fandom terms.) I just think we need to have a manhole cover in place to keep sludge out of mainspace. Wikipedia has more well-developed safeguards in place (RS, notability). WordyAndNerdy (talk) 02:51, 23 September 2021 (UTC)

What does it mean to archive something?

"Sources that are cited to fulfill the attestation requirement must be accessible to future readers." This is unknowable. The OED can act as their own archive: it doesn't matter if the source (website) exists in the future, or any external archive of it: they have recorded the usage in their own database and that is good enough, since they are staffed by reliable editors that would not invent things. For us it is a different matter: how can we say that we can be our own archive? Anyone can add a quote and the likelihood that someone will immediately check that it currently exists (and then what?) is not low, but not guaranteed either. And there's no guarantee that the Internet Archive will exist in the future. Just things to think about. DTLHS (talk) 01:49, 20 September 2021 (UTC)

Or consider this scenario: a word is challenged at RFV with quotations that no longer appear on any archiving service. What should the procedure be if no other quotations can be found? DTLHS (talk) 01:55, 20 September 2021 (UTC)
Then the word no longer meets CFI, right? So it should be removed. If no replacements could be found, I would start to be skeptical that the ostensible quotations ever existed, so this seems like the right call. —Kodiologist (t) 12:02, 20 September 2021 (UTC)
Is there any precedent from languages becoming 'well-documented'? Suddenly a single authoritative usage ceases to be sufficient. --RichardW57m (talk) 11:55, 4 October 2021 (UTC)
I agree that we aren't an archive, which is one reason I recommend using the Internet Archive instead of copying pages onto some Wikimedia resource. The question of whether the Internet Archive will be accessible in the future is not unknowable; it is merely uncertain, and the uncertainty that applies is rather small for realistic time horizons, because the Internet Archive is the most serious, ambitious, and well-funded organization for this exact purpose. Worrying about it disappearing is like worrying about the New York City Public Library disappearing. —Kodiologist (t) 12:02, 20 September 2021 (UTC)

"A Wiktionary editor must not cite their own utterances"[edit]

There are loopholes here: two editors could cite each other, etc. (as the writers of vanity-press books often do with reviews). Maybe we should look at WP's policy on not editing things that you are involved in. Equinox 17:41, 20 September 2021 (UTC)

That's a fair point: two people whose works are completely separate could agree to pimp each other's works by citing the other as much as possible on Wiktionary. I'm not sure what to do about this. Perhaps Wiktionary:Spam should be amended. If nobody's actually done it yet, maybe we should wait until the problem actually arises before making a new rule. I know Wikipedia is a big target for spammers, but I don't know how much Wiktionary has had a problem with it. —Kodiologist (t) 17:55, 20 September 2021 (UTC)
@Kodiologist: I don’t think anyone should do anything about this, and certainly not in connection with this vote. Fay Freak (talk) 20:18, 20 September 2021 (UTC)
I don't like the word 'utterance'. If I ask the Internet Archive to archive something, have I uttered it? --RichardW57m (talk) 11:01, 1 October 2021 (UTC)
Yes, but only in the same trivial sense that when I read from a piece of paper on which I wrote a Lincoln quote, I'm uttering the same thing he did. I'm still quoting Lincoln's utterance, not my own. —Kodiologist (t) 15:42, 1 October 2021 (UTC)

Provide examples

What are some words that failed RFV or barely passed that would be citable with this proposal? What are some words that editors have feared to enter? Vox Sciurorum (talk) 20:39, 21 September 2021 (UTC)

The RFV that got me started down the road that led to this proposal was for a word I added, "sniddy". It's easy to find widespread use of this slang word on the Web, but not on Usenet (it's too recently coined) or in print (it's too fannishly obscure). I originally thought WT:ATTEST clearly allowed web pages if they were archived, but it turns out there's considerable disagreement about what the current wording means, so here we are.

"murderhobo" is another word I added that's well-attested on the Web but probably not elsewhere. We have a Usenet citation, but only one. It is no more or less ripe for RFV than "sniddy", I think, but nobody has pulled that trigger. —Kodiologist (t) 21:19, 21 September 2021 (UTC)Reply

Update: "murderhobo" has been RFVed in reaction to my comment here. —Kodiologist (t) 11:25, 22 September 2021 (UTC)
It is not clear what you expected to happen when you told us about it as an example publicly. J3133 (talk) 11:35, 22 September 2021 (UTC)
Some more examples of slang currently undergoing RFV that are well attested on the Web: hunbot, PMV, Talibro, apefirmative action, glowie. (Well, "PMV" is probably not slang per se.) —Kodiologist (t) 13:07, 22 September 2021 (UTC)
There's lots of internet or gaming jargon that's currently missing and that I'm unable to find good citations for on Google Groups. To give a few examples: onetap/one tap, larp (internet sense missing), spinbot, phonepost, clutch (has a more specific meaning in team shooters).
I'm also not sure if entries like sussy, slide into the DMs or cheese (ety 4) would survive if somebody were to RfV them (please don't). Fytcha (talk) 14:58, 14 November 2021 (UTC)

Ormulum

What is the basis for including material from the Ormulum? How is it 'reliable', as opposed to genuine? As far as I can tell, it failed a quality threshold in that it was not copied, so it is no more a book than someone's journal. We even have a whole category for terms found only in it! The only defence I see for using it is that what is left of it is now well-archived. One might even argue that it was written to push his phonetic spelling scheme - surely a disqualification! --RichardW57m (talk) 16:20, 6 October 2021 (UTC)