Template talk:hu-hyphenation

From Wiktionary, the free dictionary

unhyphenated syllables: a tentative list

(Moved from User_talk:Panda10#unhyphenated_syllables:_a_tentative_list)

I tried to make a list of the affected lemmas (though not thoroughly; there may be some omissions): see User:Adam78/todo/unhyphenated syllables. The affected unhyphenated lemmas comprise about every seventh (14.9%) of all unhyphenated lemmas (3,726 out of 24,959). When the template is created (I need to think some more about its implementation), we might consider asking a bot owner to make the changes, if we can properly define where this symbol needs to be inserted (within the new template). Adam78 (talk) 22:29, 20 December 2021 (UTC)

@Adam78 The first step might be creating {{hu-hyphenation}} and the corresponding Module:hu-hyphenation. The current hyphenation module can be copied for now. Then I can move this discussion to the template's discussion page. I think we should also have a list of non-lemmas, since we do add hyphenation to them, too. I don't understand this sentence: "The affected unhyphenated lemmas comprise about every seventh (14.9%) of all unhyphenated lemmas." Did you mean of all hyphenated lemmas? Panda10 (talk) 21:49, 21 December 2021 (UTC)

I meant lemmas that do not contain a hyphen, unlike suffix lemmas. Originally I excluded suffixes and prefixes (which have word-initial and word-final hyphens, respectively), hyphenated terms like ezüst-klorid, and multi-word expressions; I listed these groups separately, after the primary list. I agree that it should be extended to non-lemma forms, though I hope this list can also help us find the relevant ones among them. In fact, I realized that I needn't have collected these lemmas, since a bot could find those elements in the hyphenation lines that have more than one vowel between two vertical bars. However, this list may still be useful for checking the operation of the bot, and there may also be some false positives, namely terms of foreign origin where vowel letters cannot be separated (e.g. siemensi). OK, I'll try to create the template (I'm not sure yet if we need a module). Adam78 (talk) 23:07, 21 December 2021 (UTC)
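(Editorial sketch of the bot check described above, not code from the discussion.) The idea is to scan each hyphenation line and flag any segment between vertical bars that contains more than one vowel letter. The template names `hyphenation` and `hyph`, the function name, and the vowel inventory are assumptions made for illustration; as noted above, foreign-origin terms such as siemensi would show up as false positives.

```python
import re

# Hungarian vowel letters (assumed inventory for this sketch).
HU_VOWELS = set("aáeéiíoóöőuúüű")

# Assumed template names; matches e.g. {{hyphenation|hu|elő|adás}}.
HYPH_RE = re.compile(r"\{\{(?:hyphenation|hyph)\|hu\|([^{}]*)\}\}")

def multi_vowel_segments(wikitext):
    """Return hyphenation segments that contain more than one vowel letter."""
    hits = []
    for match in HYPH_RE.finditer(wikitext):
        for segment in match.group(1).split("|"):
            if "=" in segment:
                continue  # skip named parameters
            vowel_count = sum(1 for ch in segment.lower() if ch in HU_VOWELS)
            if vowel_count > 1:
                hits.append(segment)
    return hits
```

For example, `multi_vowel_segments("{{hyphenation|hu|elő|adás}}")` would flag both `elő` and `adás`, while `{{hyphenation|hu|a|dás}}` would yield nothing.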

The first problem is the vertical bar. Either we supply some other symbol in its place (by default, a middle dot), or we rework the template with nested conditions like "if there is a second parameter, display it, if there is a third, display that too," etc.

However, I can't come to terms with using the normal period. It's too similar to the middle dot, and it's kind of strange to expect the reader to pay attention to the vertical position of the dot. Almost like musical notes – or the Braille system, or dominoes, or dice… However, if the space between hyphenable elements could be increased, it could work, e.g. by means of a katakana middle dot: "e‧lő・a‧dás" (which would imply that this symbol should be used in other Hungarian hyphenations as well). If there is a module inside the template, applied in all hyphenation lines in Hungarian entries, it could replace manually supplied periods with middle dots and vertical bars with katakana middle dots (or whatever else). Adam78 (talk) 19:02, 22 December 2021 (UTC)
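(Editorial sketch, not code from the discussion.) The replacement step suggested above could look roughly like this: periods typed inside a parameter become hyphenation points (U+2027), and the boundaries between parameters, i.e. the vertical bars, become katakana middle dots (U+30FB). The function and constant names are hypothetical, and a real module on the wiki would be written in Lua rather than Python.

```python
HYPHENATION_POINT = "\u2027"    # shown between syllables that are not normally split
KATAKANA_MIDDLE_DOT = "\u30fb"  # shown where a line break is allowed

def normalize_hyphenation(params):
    """Render template parameters such as ["e.lő", "a.dás"] as one display string."""
    # Periods inside a parameter mark the weaker, normally unhyphenated boundary.
    inner = [part.replace(".", HYPHENATION_POINT) for part in params]
    # Parameter boundaries (vertical bars in the template call) get the wider dot.
    return KATAKANA_MIDDLE_DOT.join(inner)

print(normalize_hyphenation(["e.lő", "a.dás"]))
```

Under these assumptions the call reproduces the "e‧lő・a‧dás" example from the comment above.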

@Adam78 When calling the new {{hu-hyphenation}}, how about using two vertical bars in places that are not usually hyphenated: {{hu-hyphenation|a||kár|mi|lyen}}? The "e‧lő・a‧dás" format will not look good in long words: "ál ‧ lam ‧ i‧gaz ‧ ga ‧ tá ‧ si". The middle dot is used in all the other languages. If we don't have to conform to that, then we can implement the same symbols as in Laczkó-Mártonfi Helyesírás: ál-lam-i‧gaz-ga-tá-si. This would also work in ezüst-klorid because it can be hyphenated as ezüst- (end of line) and klorid (start of line). Panda10 (talk) 20:18, 22 December 2021 (UTC)

@Panda10: The double vertical bars just create more entries for parameters. If you don't like "ál・lam・i‧gaz・ga・tá・si" (there are no spaces here, only a different kind of dot), we might also consider the half-width middle dot for katakana: "ál･lam･i‧gaz･ga･tá･si" (source for both: w:Hyphen#Unicode), although it's too similar to the middle dot. Well, maybe after all we could still return to the good old Osiris.

I'm afraid I can't manage it on my own: there is an error whether I invoke it as {{hu-hyphenation|e.lő|a.dás}} or as {{hu-hyphenation|hu|e.lő|a.dás}}. I think I'll need some help from the Grease Pit. Adam78 (talk) 21:06, 22 December 2021 (UTC)

@Adam78 I forgot that the double vertical bar separates hyphenation variants. It is a useful parameter, and we want to keep it for the same purpose. What about using the pound sign? Here is what I think:
  1. The user enters {{hu-hyphenation|a#kár|mi|lyen}}.
  2. {{hu-hyphenation}} takes this input and sends it to Module:hu-hyphenation.
  3. The module checks for the # character and changes it to the special character to be used between unhyphenated vowels.
It would be nice if the template itself could check for # and replace it, because in that case we would not need a Hungarian-specific module, but I'm not sure if the template language can parse a string.
The special character to be displayed between the unhyphenated vowels has to be less prominent than the other character that separates the rest of the syllables. That's why I have a hard time agreeing to characters that appear to be more important than the current middle dot. If we can't come up with a good character, then we will have to turn to the Osiris style (a‧kár-mi-lyen) after confirming with the community that not conforming to this wiki's style is acceptable. Panda10 (talk) 12:49, 23 December 2021 (UTC)
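(Editorial sketch, not code from the discussion.) The three steps proposed above can be illustrated as follows: the parameters arrive already split on the vertical bar, each # is replaced with the less prominent separator, and the parameters are joined with the regular separator. Since the choice of symbols is still open, both are arguments here; the defaults follow the Osiris-style example (a‧kár-mi-lyen). The function name is hypothetical, and an actual Module:hu-hyphenation would be written in Lua.

```python
def render_hyphenation(params, strong="-", weak="\u2027"):
    """Join parameters with `strong`; replace '#' inside a parameter with `weak`.

    `strong` marks ordinary hyphenation points, `weak` marks boundaries
    that are not normally hyphenated (e.g. after a single-vowel syllable).
    """
    return strong.join(part.replace("#", weak) for part in params)

# {{hu-hyphenation|a#kár|mi|lyen}} would reach the module as three parameters.
print(render_hyphenation(["a#kár", "mi", "lyen"]))
```

Keeping the two separators as arguments means the same function could produce either the Osiris style or a middle-dot style once a symbol is agreed on.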