https://doi.org/10.25312/j.8944


Artem Velychko https://orcid.org/0009-0001-1299-0377 University of Bialystok

e-mail: a.velychko@uwb.edu.pl


Lexicographic issues in compiling bilingual learner dictionaries of idioms. Part II: Developing dictionary model entries

Kwestie leksykograficzne związane z tworzeniem dwujęzycznych słowników idiomów dla osób uczących się. Część II: Opracowywanie haseł modelu słownikowego


Abstract

This article is the second in a series devoted to common lexicographic issues arising in compiling bilingual learner dictionaries of idioms. It discusses selected aspects of the structure of such dictionaries and describes a dictionary model created to illustrate practical lexicographical solutions. Specifically, the structural aspects discussed here encompass macrostructural issues such as the overall arrangement of idioms, mediostructural aspects involving the use of cross-references, and microstructural issues related to the formulation of definitions, selection of example sentences, and provision of linguistic labels. The resulting dictionary model features 34 entries of English weather idioms. These expressions are likely to engage learners due to the cultural significance of discussing weather in the British Isles. Yet, the solutions discussed here are broadly applicable and could be used in the compilation of bilingual dictionaries of idioms in general.

Keywords: bilingual dictionary, idiom dictionary, lexicography, lemma arrangement, example sentences, dictionary definitions, weather idioms

Streszczenie

Niniejszy artykuł jest drugim z serii poświęconej powszechnym problemom leksykograficznym pojawiającym się przy tworzeniu dwujęzycznych słowników idiomów dla osób uczących się. Omówiono w nim wybrane aspekty struktury takich słowników i opisano model słownika stworzony w celu zilustrowania praktycznych rozwiązań leksykograficznych. W szczególności omówione tutaj aspekty strukturalne poruszają kwestie makrostrukturalne, takie jak: ogólny układ idiomów, mediostrukturalne, obejmujące stosowanie odsyłaczy, oraz mikrostrukturalne, związane z formułowaniem definicji, wyborem przykładowych zdań i dostarczaniem wskaźników językowych. Powstały model słownika zawiera 34 hasła angielskich idiomów pogodowych. Te wyrażenia prawdopodobnie zaangażują uczniów ze względu na kulturowe znaczenie dyskusji o pogodzie na Wyspach Brytyjskich. Jednak omawiane rozwiązania mają szerokie zastosowanie i mogą być wykorzystane przy tworzeniu dwujęzycznych słowników idiomów w ogóle.

Słowa kluczowe: słownik dwujęzyczny, słownik idiomów, leksykografia, układ lematów, przykładowe zdania, definicje słownikowe, idiomy pogodowe


Introduction

Idioms constitute a crucial and unique aspect of each language. They pose difficulties for language learners, who often find it challenging to discern their actual meanings, which have been shaped by cultural factors for centuries and are now embedded in seemingly familiar words. To ensure that the representation of these lexical items is exhaustive and convenient for dictionary users, various lexicographic issues need to be addressed. The first part of this series (Velychko, 2025) deals with the identification of Polish idiomatic counterparts for English idioms. With 30 Polish equivalents selected for 34 English idiomatic expressions, the present article takes the next step by producing a lexicographic representation of these items.

This article discusses central issues arising in the process of compiling a bilingual learner dictionary of idioms, thus focusing on arranging lemmata, formulating definitions, providing cross-references, choosing example sentences, and adding linguistic labels. In addition to outlining possible solutions to the problems typically accompanying these procedures, this article illustrates the application of some of the discussed solutions in practice by creating a dictionary model on the theme of weather idioms. Section one describes the dictionary model of weather idioms by discussing its topic, primary and secondary functions, target users, and idiom types included. The following section focuses on approaches to arranging entries and providing cross-references. Section three discusses approaches to formulating definitions, while section four illustrates methods of choosing example sentences from corpora. Section five touches upon linguistic labels indicated in the dictionary model entries. The last section presents the result of this work, which is 34 entries of the bilingual dictionary model of weather idioms. The article ends with a summary of the previously discussed topics.

Description of the bilingual dictionary model of weather idioms

The topic covered by the dictionary model is English weather idioms and their Polish idiomatic and non-idiomatic equivalents. These expressions either contain weather-related words, e.g. to be (as) right as rain, or are directly relevant to the topic of weather conditions, e.g. a cold snap. Having been shaped for centuries, this class of vocabulary now constitutes an integral part of the English language. Owing to the geographical location of the British Isles, changeable and unpredictable weather has had a tremendous impact on people’s way of thinking and their speaking habits, thus giving rise to many weather-related idioms and expressions (Zoltán, 2013: 270). Many English weather expressions, including idioms and metaphors, are rooted in associations drawn between certain weather conditions and habitual perceptions of aspects of human life. Accordingly, a storm is associated with the manifestation of aggression or upcoming hard times, whilst rain is believed to be a sign of misfortune (Żołnowska, 2011: 172f).

The dictionary model of weather idioms is passive, meaning that it provides translations of lemmata from L2 into L1, that is, from English into Polish. Nevertheless, this reference work may be advantageous for active use as well since it includes both linguistic data typical of monolingual dictionaries, such as SL (source language) definitions and example sentences, as well as linguistic data typical of bilingual dictionaries, such as TL (target language) equivalents, definitions, and example sentences. The primary focus on the reception component is due to the fact that it is idiom reception and L2 to L1 translation that pose the most difficulties for L2 dictionary users, while in production, they can convey the same ideas by using paraphrases or other non-idiomatic lexical items (Svensén, 2009: 193). Consequently, the dictionary model is aimed primarily at Polish native speakers who are expected to have at least an intermediate (B1) and preferably upper-intermediate (B2) level of proficiency in the English language according to the Common European Framework of Reference (CEFR).

The dictionary is designed to be consulted by high school and university students as well as by individuals willing to enrich their English language lexicon. As mentioned earlier, the secondary objective of the dictionary is to aid native speakers of Polish in active usage, a need that is frequently left unmet by typical bilingual dictionaries, which limit their range of use by focusing solely on equivalents and not including example sentences, foreign language definitions, etc. To summarise, the most relevant information a Polish native speaker can retrieve for decoding purposes is English definitions and Polish equivalents, and, for encoding purposes, explanations and examples of usage in English, grammatical information, as well as information on pronunciation, which is sometimes included for idiom constituents that are not widely used at B1 and B2 levels. The dictionary can also be consulted as a bidirectional active dictionary by learners of Polish who are proficient in English and whose level of Polish is expected to be at least upper-intermediate (B2). This higher proficiency level is required because of the more advanced vocabulary of the Polish definitions and cited examples. Consequently, for encoding, these users will find Polish equivalents of English idioms and their examples of usage in context.

The total number of English weather idioms in the dictionary model is 34, while the number of their Polish equivalents identified in the first part of the study (Velychko, 2025) equals 30¹. Various idiom types, i.e. sayings or sentential idioms and proverbs or non-sentential idioms², are included. Headwords of sentential idioms are capitalised to reflect their appearance in natural text:

a lightning conductor

Lightning never strikes (the same place) twice

put/keep sth on ice

The dictionary is synchronic since it presents idioms that are either extensively used in contemporary language or are widely recognised by native speakers of both languages. For this reason, idioms such as it is raining cats and dogs³ or to keep a weather eye on something/somebody are present in the dictionary. The selection process considered an idiom’s presence in well-known dictionaries such as the Collins Cobuild Dictionary of Idioms (CCDI) (1995), the Cambridge International Dictionary of Idioms (CIDI) (1998), the Merriam-Webster Dictionary (MWD) (2004), and A Dictionary of American Idioms (DAI) (2004).


Lemma arrangement

In dictionaries, an idiom can appear as a lemma in entries or as a subentry under one of its constituents’ entries. Given its exclusive focus on idioms, the dictionary model adopted the former approach, which was more suitable due to the independent status and opaque meanings of idioms (Svensén, 2009: 194f).

A separate issue is that of the appropriate arrangement of lemmata in the dictionary macrostructure. Placing multi-word entries based on the most semantically prominent component may be beneficial for L1 users, who are likely to locate a desired lexical item through logical reasoning. L2 users, in turn, would benefit much more from a form-based searching system. By way of example, a structural approach can be adopted for placing idioms in a dictionary (Yong, Peng, 2007: 181ff). Thus, several issues on the part of a dictionary compiler are addressed through the structural description and categorisation of idioms. These issues often include determining an idiom’s key and secondary constituents, reducing cross-references for the sake of space-saving, and preventing inconsistency and confusion over the treatment of idioms of similar structures. In this regard, each category is defined based on certain structural combinations and their constituent types, with a key constituent being a determining factor that contributes to an idiom’s meaning. Grammatical words are neglected, whereas lexical words play a prominent role in entry arrangement. Six main categories of phrase idioms⁴ can be distinguished and placed under the corresponding entries: nominal idioms (break the ice, rain or shine, etc.) at noun entries, adjectival idioms (fair-weather friend, hot and bothered, etc.) at adjectival entries, pronominal idioms (all or nothing, something of a (something), etc.) at pronominal entries, verbal idioms (check in, turn something off, etc.) at verbal entries, adverbial idioms (on the never-never, once again, etc.) at adverbial entries, and numeral idioms (at sixes and sevens, for two cents, etc.) at numeral entries. Although not exhaustive, this categorisation still offers several significant advantages. Since an idiom’s placement is determined based on the above guidelines, it is practically unnecessary to provide cross-references, which makes this approach space-saving. More importantly, lexicographers will benefit from this method as it greatly facilitates entry arrangement and leaves no doubt as to where a certain idiom should be placed. As regards users, to avoid difficulties when consulting a dictionary, they should refer to the user’s guide. However, those who lack a general knowledge of parts of speech may still face some problems.

1 It is noteworthy that for some English idioms, Polish idiomatic equivalents were not identified, and Polish definitions were provided instead, hence the difference between the number of English idioms and Polish equivalents.

2 A detailed discussion on idiom typology can be found in the first article of the series.

3 See publications by Rundell (1995) and Takaie (2002), who express opposing views regarding the actual usage and native speakers’ familiarity with the idiom it is raining cats and dogs.

4 Adopted from Yong and Peng (2007: 177ff), this classification of idioms according to their word class distinguishes between phrasal idioms and sentence (sentential) idioms. The former include fixed combinations such as verb + noun, verb + noun + preposition, preposition + noun, adjective + preposition + noun, etc., while the latter have features of sentences, e.g. subject-verb agreement, verb inflexions, etc. Notably, verbal idioms are treated here as phrasal verbs. This exact classification of idioms is provided for demonstrational purposes and, therefore, does not mean that the structural approach cannot be applied to other idiom typologies.

Sentence idioms are normally positioned under a headword that acts as a subject within a sentence. Consequently, the idiom every cloud has a silver lining would be placed under the headword cloud. As this example indicates, a noun phrase may be composed of one word, and users will easily locate this idiom. However, a noun phrase may also be represented by a group of words, as in birds of a feather flock together. The headword in the noun phrase birds of a feather should then be necessary for placing the idiom in macrostructure. In this case, the noun birds is a headword, and the whole idiom will therefore be placed under the lemma bird. Another possible noun phrase composition is a sequence of words including a clause as in those who are quick to promise are generally slow to perform. If the first element is a pronoun (those in this case) or another grammatical word, another keyword should be determined. This can be achieved by focusing on the words surrounding the main verb in the sentence or the main verb itself. In this idiom, the fitting entry is quick, because it is the closest word to the verb promise.

Verbless sentence idioms, as well as other idioms with complicated sentence structures, should be provided under the first keyword. In the idioms any port in a storm and it’s an ill wind that blows nobody any good, the keywords that should serve as lemma entries are port and ill, respectively.

An alternative, onomasiological approach to arranging lemmata involves grouping entries into semantic categories, where idioms with similar or related meanings are listed together (Svensén, 2009: 197; Michta, 2022: 90ff). Considering that the dictionary model of weather idioms is a thematic dictionary, it adopts the onomasiological approach rather than the form-based and structural or alphabet-governed, exclusively semasiological⁵ dictionary macrostructure. Furthermore, the idioms are arranged in a niche-alphabetic word list, where each niche-entry lemma, which is a heading lemma, is segmented into niches headed by niche lemmata (Bergenholtz, Tarp, 1995: 193ff). A prerequisite for defining a thematic category and placing lemmata within the dictionary is constituents related to the topic of weather. Both niche-entry lemmata and niche lemmata follow the alphabetic order, with the niche lemmata having the letter-by-letter placement. In the example below, a niche-entry lemma is lightning, while niche lemmata are idioms a lightning conductor and a lightning rod:

LIGHTNING

a lightning conductor (BrE) a person or thing that is criticised for something, although other people are also responsible (cf. a lightning rod).

a lightning rod (AmE) (see a lightning conductor)

5 Being based on the expressional aspect, semasiological dictionaries arrange entries alphabetically pursuant to their spelling, pronunciation, rhyme, etc. (Svensén, 1993: 23ff).
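The niche-alphabetic arrangement can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the procedure used to compile the dictionary model: the function names, the sample niche COLD, and the decision to count a leading article when sorting are all hypothetical choices made for the example.

```python
import re


def letter_by_letter_key(idiom: str) -> str:
    """Sort key for letter-by-letter order: ignore spaces, punctuation and case.

    Assumption: leading articles such as "a" count towards the order.
    """
    return re.sub(r"[^a-z]", "", idiom.lower())


def build_word_list(niches: dict[str, list[str]]) -> list[tuple[str, list[str]]]:
    """Order niche-entry lemmata alphabetically and sort the niche lemmata in each."""
    ordered_headings = sorted(niches.items(), key=lambda item: item[0].lower())
    return [
        (heading.upper(), sorted(idioms, key=letter_by_letter_key))
        for heading, idioms in ordered_headings
    ]


# Hypothetical input: niche-entry lemma -> idioms that contain it.
niches = {
    "lightning": ["a lightning rod", "a lightning conductor"],
    "cold": ["come in from the cold", "a cold snap"],
}

word_list = build_word_list(niches)
for heading, idioms in word_list:
    print(heading)
    for idiom in idioms:
        print("  " + idiom)
```

A different convention, such as skipping articles or sorting word by word, would only require changing the key function.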

Whenever there is more than one weather-related word within an idiom, the first of them is selected as the main entry, and the remaining one or more are treated as secondary. Consequently, the idiom to blow hot and cold is located under the entries hot and cold, with all its microstructural information provided in the entry for hot:

HOT

blow hot and cold (about sb/sth) to often change your attitude toward someone or something so that people cannot understand your real feelings.

COLD

blow hot and cold (see HOT)

In order to guide users to the primary entry of the idiom, a cross-reference is provided under the secondary entry, in this case under cold. Cross-references may occur in primary and secondary entries simultaneously if secondary ones include additional information about idioms, such as their alternative forms, regional varieties, etc. Cross-references are indicated by see or cf. markers. The indicator see guides users from one idiom to another, where its definition and other microstructural data are given, while the cf. indicator directs users to an idiom’s alternative form or other expressions it may be related to. If idioms belong to one niche-entry lemma, the cross-references they contain take the form (see/cf. idiom), as in the example below:

LIGHTNING

a lightning conductor (BrE) a person or thing that is criticised for something, although other people are also responsible (cf. a lightning rod)

a lightning rod (AmE) (see a lightning conductor)

When idioms are located under different niche-entry lemmata, cross-references are provided as either (see NICHE-ENTRY LEMMA) or (cf. idiom at NICHE-ENTRY LEMMA entry):

COLD

blow hot and cold (see HOT)

or

a brass monkey /brɑːs/ (BrE, informal) to be extremely cold (cf. brass monkey weather at WEATHER entry)

When directing to two or more idioms, the ‘&’ sign is placed between them, e.g. (cf. bring somebody/something in from the cold & come in from the cold).
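The cross-reference conventions described in this section can be summarised in a small Python sketch. The function name and parameters are hypothetical and serve only to restate the rules given above; they are not part of the dictionary model itself.

```python
def cross_reference(marker: str, idioms: list[str], same_niche: bool, niche: str = "") -> str:
    """Format a cross-reference following the conventions described above.

    marker     -- "see" or "cf."
    idioms     -- the idiom(s) referred to; several idioms are joined with '&'
    same_niche -- True if source and target share one niche-entry lemma
    niche      -- niche-entry lemma of the target (needed when same_niche is False)
    """
    if marker not in ("see", "cf."):
        raise ValueError("marker must be 'see' or 'cf.'")
    joined = " & ".join(idioms)
    if same_niche:
        return f"({marker} {joined})"
    if not niche:
        raise ValueError("a niche-entry lemma is required for a different niche")
    if marker == "see":
        # A secondary entry only points to the niche-entry lemma of the primary entry.
        return f"(see {niche.upper()})"
    return f"(cf. {joined} at {niche.upper()} entry)"


print(cross_reference("cf.", ["a lightning rod"], same_niche=True))
print(cross_reference("see", ["blow hot and cold"], same_niche=False, niche="hot"))
print(cross_reference("cf.", ["brass monkey weather"], same_niche=False, niche="weather"))
```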

Formulating definitions

Lexicographic definitions can be provided in different formats, which should be determined by dictionary functions, category of an elucidated lemma, and target users. The formats discussed in this section are analytical phrases, full-sentence definitions, synonyms (Kamiński, 2021: 90), and paraphrases (Svensén, 2009: 201).

Analytical phrases, which are most effective for defining nouns, consist of a superordinate term along with further semantic characteristics that are peculiar to a given lemma when compared to other members of its category (Kamiński, 2021: 90). For example, the Cambridge Dictionary website defines the word cat as “a small animal with fur, four legs, a tail, and claws, usually kept as a pet or for catching mice”. In this definition, the phrase “an animal” serves as a superordinate term, whilst the following description lists more specific differentiating features of the word.

Commonly found in learner dictionaries, full-sentence definitions include an introductory clause, which features a given lexical unit within its typical context, and the main clause, which explains its meaning (Barnbrook, 2002). This type of linguistic information is particularly advantageous for elucidating verbs in dictionaries aiming to assist in language production. In addition to encompassing both contextual information and semantic characteristics of the lemma, an important advantage of such definitions is accessibility for language learners, which reflects the fact that they resemble natural conversations. Moreover, when examining full-sentence definitions in the Collins Cobuild English language dictionary (1987), where this definition model was first introduced, Adamska-Sałaciak (2012: 328) notes that it avoids purely technical lexicographic elements such as abbreviations, parentheses, tildes, slashes, and omitted articles. Consequently, full-sentence definitions are accessible even to those who might be unfamiliar with specific lexicographic conventions.

The third definition format involves the provision of one or more synonyms, or near-synonyms, of a lemma. Synonym definitions offer paradigmatic information and are useful for text production and vocabulary expansion. This format is advantageous for space-saving purposes (Svensén, 1993: 119) and in cases when high semantic precision is not imperative (Landau, 2001: 398). The drawbacks it suffers from directly concern the notion of synonymy itself since identifying absolute synonyms in a language is rarely feasible (Kamiński, 2021: 90ff). Even if absolute synonyms exist, they may turn out to be more challenging in terms of comprehension for dictionary users than the lemma itself. Other problems may arise when users are misled by possible regional, stylistic, contextual, and expressive features of provided synonyms, which lexicographers may unwittingly overlook. In some dictionaries, however, it is an established practice to provide both analytical and synonym definitions in one entry.

Lemmata comprised of different parts of speech, as well as of idioms, can likewise be explained by means of paraphrases. Svensén (2009: 201) points out that definitions of motivated idioms⁶ may reuse one of their components, e.g. to be as cold as ice (“to be very cold”)⁷ or to get into hot water (“to get into a situation when someone is angry at you and you are likely to be punished or criticised”). This method of explaining meaning is designed to align as closely as possible with the structural and semantic features of idiomatic expressions.

6 In Kvetko’s (1999: 43f) idiomatic typology, motivated idioms are referred to as phraseological combinations, whereas Svensén (2009: 190ff) uses the term semi-idioms.

In the English-Polish dictionary model of weather idioms, descriptive explanations of meanings are given in both L2 and L1. English definitions occur much more frequently because for most English weather idioms, Polish idiomatic equivalents were identified and there was no need to provide additional Polish definitions considering target users and dictionary functions.

SL (English) definitions for lemmata are provided on the grounds that TL (Polish) equivalents of idioms do not always fully correspond to SL items and, in many instances, merely constitute their approximate analogues (Svensén, 2009: 201ff). These were formulated in the format of paraphrases. There were several reasons for the choice of this format. First, as stated earlier, analytical phrases are normally employed for defining nouns rather than multiword expressions such as idioms. Second, similar to Polish equivalents, English synonyms did not always sufficiently match the lemma semantically. Where it was necessary, the data concerning synonymous or antonymous idioms related to the lemma was provided in the form of cross-references and, therefore, there was no need to list them twice in a single entry. Moreover, explaining the meaning of idioms, which are mostly multiword expressions, through synonyms was oftentimes unfeasible. Last but not least, the paraphrases in the dictionary model predominantly constitute combinations of two or more full-sentence definitions found in a variety of reference sources. For space-saving purposes, it was decided to retrieve and focus only on the semantic aspects of these definitions, especially because the weather idioms were further exemplified in the field of cited examples.

The sources used for formulating definitions included paper dictionaries such as the CIDI, MWD, CCDI, and DAI, as well as online dictionaries such as the Cambridge Dictionary website, the Merriam-Webster Dictionary website, the Oxford Learner’s Dictionaries website, and the Lexico website. To give an example, the definitions for three related idioms to come in from the cold, to leave somebody/something out in the cold, and to bring somebody/something in from the cold are based on those found in CIDI and CCDI. Table 1 shows the definitions from these dictionaries and the weather dictionary model definitions resulting from them. It is worth mentioning that the third idiom’s definition combines the definitions of the other two idioms, since CIDI does not define this expression and CCDI mentions it only in a note within another entry.


7 These definition examples are taken from the dictionary model compiled as a part of the current article.

Tab. 1. Definitions formulated on the basis of the CIDI and CCDI

Idiom 1: to come in from the cold
CIDI definition: If someone comes in from the cold, they become part of a group or an activity which they were not allowed to join before.
CCDI definition: If someone or something comes in from the cold, they become popular, accepted, or active again after a period of unpopularity or lack of involvement.
Resulting definition: to join or become popular again in a group that did not accept you before or to finally start or return to participating in an activity which you were not allowed to do before

Idiom 2: to leave somebody/something out in the cold
CIDI definition: to not allow someone to become part of a group or an activity
CCDI definition: If a person or organisation is left out in the cold, they are ignored by other people and are not asked to take part in activities with them.
Resulting definition: not to let someone or something join or return to a group or participate in an activity again

Idiom 3: to bring somebody/something in from the cold
CIDI definition: the definition is absent
CCDI definition: In the form of an additional explanation in the entry for to come in from the cold: “You can also say that they are brought in from the cold”.
Resulting definition: to let someone or something join or return to a group or participate in an activity which they were not allowed to do before

Source: own elaboration based on the sources indicated in the table.


In cases when entries featured polysemous lemmata, these were divided into several sections, each representing a certain sense, as in the following example:

(come) rain or shine⁸ 1. whatever the weather is. 2. whatever happens.

Another important aspect of formulating L2 definitions that lexicographers should consider is the intended user’s proficiency in the foreign language. Controlling definition vocabulary is a common procedure that ensures the match between L2 proficiency level and the sophistication of the lexicon employed in definitions. According to Neubauer (1989: 900), there are five methods of controlling definition vocabulary. The first method is the most unrestricted, allowing for the use of any words to define lexical units. The second approach relies on utilising ‘simple language’ but with no strict criteria as to which words should be avoided. The third method is based on a specified list of words, most of which should be prioritised in making definitions. In the fourth approach, all the defining vocabulary items must belong to a specified list of words, while in the last approach, not only defining vocabulary but also the meanings of words in definitions must be strictly controlled by applying only predetermined ones.

8 Even though the first sense of this expression might not seem to be figurative, both senses are provided in the CIDI, and, consequently, are included in the dictionary model.

The dictionary model assumes that native Polish speakers are expected to know English at B1-B2 levels. Consequently, the English definitions were formulated by adopting the third approach, primarily utilising B1-level vocabulary as outlined by the Common European Framework of Reference (CEFR). This lexicon was prioritised, though not exclusively, in the majority of cases when word selection decisions had to be made.

In some dictionaries, words and expressions are frequently indicated by proficiency labels according to CEFR, which came in handy in compiling this lexicographic work.⁹ By way of example, of the candidate definitions 1 and 2 for the idiom to be on ice, only the latter matched the level of users’ language competence and was included in the dictionary model.

Definition 1: to be deferred or put off indefinitely.

Definition 2: to be delayed or postponed for a period of time.
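A check of this kind can be sketched in Python. The sketch is purely illustrative: the tiny word set below is a hypothetical stand-in for a graded vocabulary list, not actual CEFR data, and in keeping with the third approach the function only reports words outside the list, leaving the final decision to the lexicographer.

```python
import re

# Hypothetical stand-in for a B1-level word list; a real list would be far larger
# and would come from a graded source such as a learner dictionary.
SAMPLE_WORD_LIST = {
    "to", "be", "or", "for", "a", "of", "put", "off",
    "time", "period", "delayed", "postponed",
}


def words_outside_list(definition: str, allowed: set[str]) -> list[str]:
    """Return the words of a definition that are not in the allowed list."""
    tokens = re.findall(r"[a-z]+", definition.lower())
    return [token for token in tokens if token not in allowed]


candidates = [
    "to be deferred or put off indefinitely.",
    "to be delayed or postponed for a period of time.",
]

for definition in candidates:
    flagged = words_outside_list(definition, SAMPLE_WORD_LIST)
    print(definition, "->", flagged)
```

Under the assumed word list, the first candidate is flagged for two words and the second for none, which mirrors the choice described above.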

L1 definitions were predominantly based on paraphrases of SL definitions and paraphrases from such websites as ReversoContext, Etutor, and Bab.la. For example, since no Polish equivalent expression was identified for the English idiom to catch somebody cold, it was given the Polish definition “zaskoczyć kogoś”, which is a paraphrase of the English definition “surprise somebody with something”.


Selecting example sentences

Since this is a descriptive general-purpose dictionary, example sentences from general language corpora are used to demonstrate idiom usage in context. The major advantage of corpus-based examples over those invented by a human is that the former are more likely to feature an unbiased selection of words and more accurately capture grammatical nuances and possible restrictions of the exemplified lexical units (Hanks, 2009: 225ff). Two perspectives on the proper example selection from corpus data can be identified. One of them argues for choosing exclusively highly authentic language samples produced in real-life situations or those that served actual communication purposes. The contrary view on this issue is that authentic corpus materials tend to abound in digressions, context-related implications, and other potentially excessive data. Consequently, under this view, lexicographers should strive for idealised versions of authentic sentences, which are concise, uncomplicated, and straightforward. It is noteworthy that for some lexical items, the identification of idealised examples in corpora may be unfeasible.

For this study, an intermediate solution was employed, choosing English and Polish examples in accordance with both perspectives on corpus data selection. Some of them are quite long and sometimes consist of more than one sentence (especially those featuring sentential idioms), which is useful in giving a comprehensive account of the expressions in their typical context and achieving a higher level of authenticity. At the same time, great care was taken to ensure the context was simple, unambiguous, and unaffected by such unfavourable factors as, for instance, loose ends, digressions, and so on.

9 Alternatively, the Oxford Text Checker tool can help to ascertain the complexity of lexical items that are not labelled in available dictionaries. The tool can be accessed at https://www.oxfordlearnersdictionaries.com/text-checker/.

English example sentences were collected from the corpora available on English-Corpora.org, i.e. News on the Web (NOW), the TV Corpus, the Movie Corpus, and COCA, as well as from those available on Sketch Engine, i.e. the Film Corpus, the English Web Corpus (enTenTen), and the Spoken British National Corpus 2014. In turn, Polish example sentences were collected utilising the Narodowy Korpus Języka Polskiego¹⁰ website. Similar to definitions, example sentences had to correspond to the level of users’ L2 proficiency. Thus, of the candidate example sentences 1 and 2 for the idiom to be skating/walking on thin ice, the latter was chosen due to posing fewer challenges for L2 users:

Example sentence 1: Hofi was walking on thin ice as he never passed up an opportunity to ridicule the obvious absurdities of the current government.

Example sentence 2: Sometimes buying a used car can be like skating on thin ice.

Not only was the sentence simplicity considered in the selection process but also the context in which the sentences appeared. For this reason, in most instances, sentences including people’s personal data or specific and factual knowledge were avoided. Both example sentences 3 and 4 feature the idiom to bring somebody/something in from the cold. Yet, only example sentence 4 was chosen since the information it presents does not require any specific political or historical background to comprehend the text.

Example sentence 3: Tory Peter Jones said it was time South Africa was brought in from the cold.

Example sentence 4: The company finally brought its former workers in from the cold after months of intense negotiations.

Other topics a dictionary compiler should be cautious about when selecting examples, as well as when providing definitions, include race, gender, religion, ideology, and cultural nuances of, for instance, communities being described in a reference work (Adamska-Sałaciak, 2012: 332ff). To avoid potential alienation of the user, one can approach the process of providing extralinguistic information selectively, adopting a minimalist approach in cases of uncertainty.

At times, additional caution was needed since some examples included words or phrases that were graphically identical to certain idioms but literal in meaning, and thus should not be used in the dictionary. Example sentences 5 and 6 are literal and figurative, respectively.

Example sentence 5: A figure skater in the sparkly dress was skating on thin ice.

Example sentence 6: Sometimes buying a used car can be like skating on thin ice.

As recommended by Bergenholtz and Tarp (1995: 203f), in bilingual and monolingual dictionaries, examples should not only appear in succession, but also follow SL and TL order or, more specifically, the lemma and equivalent order. Following the authors’ advice, example sentences immediately follow the equivalents within each entry, with Polish sentences coming after English ones. To make sentences easier to distinguish from the remaining parts of an entry, they are typed in italics and introduced by an arrow sign ‘‣’, with the exemplified idioms themselves displayed in bold type. English example sentences occur first regardless of their number, whereas Polish ones are provided only if the Polish equivalents also constitute idioms.

10 The National Corpus of the Polish Language.


Labels

When required, labels were provided for lemmata in brackets. All the labels were typed in italics and placed in the following order: diatopic label, diachronous label, register, and passive voice. If two or more labels occur in one entry, they are separated by commas.

The first label in the label field, a diatopic (or diatopical) label, indicates the language variety in which a particular expression is most frequently used (Hartmann, James, 1998: 40). When more than one label appeared, preference was given to British English. Some entries include diachronous labels, represented by the designation old-fashioned. The register is indicated by the labels formal and informal; if neither is provided, the expression is assumed to be neutral in style. A label indicating that an expression occurs in the passive voice appears after the register label and contains the frequency indicator usually passive. Additionally, IPA phonetic transcription was given in several entries. Only those idiom elements potentially unfamiliar to Polish speakers with intermediate (B1) or upper-intermediate (B2) proficiency in English were transcribed. Importantly, the phonetic transcription reflects the pronunciation characteristic of British English.
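The ordering rule for labels described in this section can be illustrated with a minimal Python sketch. The category keys, the function name, and the label wording "British English" are assumptions made for illustration only; the article itself specifies just the order of the label types, the comma separator, and the surrounding brackets.

```python
# Label categories in the order given in the text:
# diatopic, diachronous, register, passive voice.
# The key names themselves are hypothetical.
LABEL_ORDER = ("diatopic", "diachronous", "register", "voice")


def format_labels(labels: dict[str, str]) -> str:
    """Return the bracketed label field, or an empty string if no label applies."""
    # Keep only the categories that are present, in the prescribed order.
    ordered = [labels[category] for category in LABEL_ORDER if labels.get(category)]
    return f"({', '.join(ordered)})" if ordered else ""


# Labels may be supplied in any order; the output follows LABEL_ORDER.
example = format_labels({
    "voice": "usually passive",
    "register": "informal",
    "diatopic": "British English",  # assumed wording of a diatopic label
})
print(example)
```

An entry without labels yields an empty string, which corresponds to the convention that an unlabelled expression is treated as neutral in style.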

Entries of the bilingual dictionary model of weather idioms

This section illustrates the resultant entries of the dictionary model of weather idioms. This reference work was created according to the principles discussed in the previous sections. When it comes to idiom selection and equivalent identification, the entries include items that were captured as a result of the analysis performed in the first part of this series. This section provides their lexicographic description. The dictionary model microstructure has the following format:

NICHE-ENTRY LEMMA

Niche lemma (diatopic, diachronous, register, voice labels) /pronunciation/

SL definition (see/cf. cross-reference). TL equivalent/TL definition


(come) rain or shine

  1. whatever the weather is. (czy) słońce czy deszcz

  2. whatever happens. (czy) słońce czy deszcz
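For readers who wish to store such entries electronically, the microstructure format given above could be modelled along the following lines. This is only a sketch under stated assumptions: the class names, field names, and the render function are hypothetical and are not part of the dictionary model itself, and the niche-entry headword is left out because it is not shown for the sample entry.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sense:
    sl_definition: str                     # English (SL) definition
    tl_text: str                           # Polish (TL) equivalent or TL definition
    cross_reference: Optional[str] = None  # e.g. "see ..." or "cf. ..."


@dataclass
class NicheLemma:
    lemma: str
    senses: list[Sense]
    labels: list[str] = field(default_factory=list)  # already in display order
    pronunciation: Optional[str] = None               # IPA, without slashes


def render(entry: NicheLemma) -> str:
    """Render a niche lemma in the order: lemma, labels, pronunciation, senses."""
    head = entry.lemma
    if entry.labels:
        head += f" ({', '.join(entry.labels)})"
    if entry.pronunciation:
        head += f" /{entry.pronunciation}/"
    lines = [head]
    numbered = len(entry.senses) > 1  # number senses only when there are several
    for number, sense in enumerate(entry.senses, start=1):
        definition = sense.sl_definition
        if sense.cross_reference:
            definition += f" ({sense.cross_reference})"
        prefix = f"{number}. " if numbered else ""
        lines.append(f"{prefix}{definition}. {sense.tl_text}")
    return "\n".join(lines)


# The sample entry shown above, re-encoded with the hypothetical classes.
entry = NicheLemma(
    lemma="(come) rain or shine",
    senses=[
        Sense("whatever the weather is", "(czy) słońce czy deszcz"),
        Sense("whatever happens", "(czy) słońce czy deszcz"),
    ],
)
print(render(entry))
```

Keeping the labels, pronunciation, and cross-references optional mirrors the fact that these fields appear only in some entries of the model.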


Summary

The current work is the second article on the topic of the representation of idioms in learner dictionaries. While the process of selecting the best-fitting TL idiom equivalents is shown in the previous article of the series, this article discusses a number of issues that should be addressed when creating a bilingual learner dictionary of idioms, including lemma arrangement, definition formulation, example sentence selection, as well as the provision of linguistic labels and cross-references.

The article provides examples of some of these lexicographic approaches by compiling the English-Polish dictionary model of weather idioms. The dictionary model presents English weather idioms and their Polish idiomatic and non-idiomatic equivalents, featuring 34 entries of 34 English and 30 Polish sentential and non-sentential idioms. It is a synchronic dictionary aimed at reception activities by Polish native speakers. In addition, Polish users can use it for production, and learners of Polish can consult it as an active bidirectional dictionary. The onomasiological arrangement of entries is governed by a niche-alphabetic word list, with weather-related elements being a defining factor in the idiom placement. Definitions of idioms are compiled from those found in other reliable sources, while example sentences adopt both perspectives on corpus-based data selection and are hence authentic and idealised at the same time. Linguistic labels appearing in some entries indicate language varieties, obsolescence of some lexical items, register, passive voice, and British pronunciation.

As regards further research on weather idioms, the current work could be expanded to include more topic-related expressions, especially focusing on those with high frequencies in large general-language corpora. Another aspect that could be enhanced in the future dictionary of weather idioms concerns the provision of more grammatical and contextual information about alternative forms of expressions, including more details regarding possible component variations. Introducing these modifications along with adding Polish definitions and pronunciation to the entries could increase the usefulness of the dictionary with regard to encoding activities.

Whatever approach might be adopted in the course of the expansion of this type of reference work, it is imperative to ensure that supplementing it with new information neither hinders its convenient consultation nor leads it away from its primary objectives. Sometimes, restrictions as to dictionary functions might prove beneficial for the users, since an overwhelming amount of linguistic data could potentially discourage them from using a dictionary.


References

Adamska-Sałaciak A. (2012), Dictionary definitions: problems and solutions, “Studia Linguistica Universitatis Iagellonicae Cracoviensis”, no. 129/4, pp. 323–339, https://doi.org/10.4467/20834624SL.12.020.0804

Barnbrook G. (2002), Defining Language: A Local Grammar of Definition Sentences, Amsterdam: John Benjamins Publishing Company, https://doi.org/10.1075/scl.11

Bergenholtz H., Tarp S. (1995), Manual of specialized lexicography: the preparation of specialised dictionaries, Amsterdam: John Benjamins Publishing Company, https://doi.org/10.1075/btl.12

Common European Framework of Reference for Languages: Learning, Teaching, Assessment (2001), Council of Europe (CEFR), Strasbourg: Language Policy Unit.

Hanks P. (2009), The Impact of Corpora on Dictionaries, [in:] P. Baker (ed.), Contemporary Corpus Linguistics, London: Bloomsbury Publishing, pp. 214–236.

Hartmann R.R.K., James G. (1998), Dictionary of lexicography, London: Routledge, https://doi.org/10.4324/9780203017685

Kamiński M.P. (2021), Defining with Simple Vocabulary in English Dictionaries, Amsterdam– Philadelphia: John Benjamins Publishing Company, https://doi.org/10.1075/tlrp.22

Kjellmer G. (1996), Idiomen, kollokationerna och lexikonet, “LexicoNordica”, no. 3, pp. 79–91.

Kvetko P. (1999), Anglická frazeológia v teórii a praxi, Bratislava: Univerzita Komenského.

Landau S. (2001), Dictionaries: The Art and Craft of Lexicography, Cambridge: Cambridge University Press.

Makkai A., Boatner M.T., Gates J.E. (2004), A dictionary of American idioms, New York: Barrons Educational Series.

Michta T. (2022), Systemy pojęć w terminologii i słowniku, Białystok: Wydawnictwo Prymat.

Mish F.C. (2004), The Merriam-Webster Dictionary, Springfield: Merriam-Webster.

Neubauer F. (1989), Vocabulary Control in the Definitions and Examples of Monolingual Dictionaries, [in:] R. Gouws, U. Heid, W. Schweickard, H.E. Wiegand (eds.), Wörterbücher Dictionaries Dictionnaires. An International Encyclopedia of Lexicography, 3 vols, Berlin: De Gruyter Mouton, pp. 899–904.

Rundell M. (1995), The word on the street, “English Today”, no. 11.3, pp. 29–35.

Sinclair J. (ed.) (1987), Collins COBUILD English language dictionary, London: Collins.

Sinclair J. (ed.) (1995), Collins COBUILD Dictionary of Idioms, Birmingham: HarperCollins.

Svensén B. (1993), Practical Lexicography: Principles and Methods of Dictionary-Making, Oxford: Oxford University Press.

Svensén B. (2009), A Handbook of Lexicography. The Theory and Practice of Dictionary-Making, Cambridge: Cambridge University Press.

Takaie H. (2002), A Trap in Corpus Linguistics: The Gap between Corpus-based Analysis and Intuition-based Analysis, “English Corpus Linguistics in Japan”, no. 38, pp. 111–130.

Velychko A. (2025), Lexicographic issues in compiling bilingual learner dictionaries of idioms. Part I: Selecting Polish equivalents for English weather idioms, “Językoznawstwo”, 1/22, pp. 17–31, https://doi.org/10.25312/j.8943

Walter E. (1998), Cambridge International Dictionary of Idioms, Cambridge: Cambridge University Press.

Yong H., Peng J. (2007), Bilingual Lexicography from a Communicative Perspective, Amsterdam–Philadelphia: Benjamins Pub, https://doi.org/10.1075/tlrp.9

Zoltán I.G. (2013), “It’s Raining Cats and Dogs” – Weather in English Idioms, “Studia Universitatis Petru Maior”, no. 14, pp. 270–277.

Żołnowska I. (2011), Weather as the source domain for metaphorical expressions, “AVANT. The Journal of the Philosophical-Interdisciplinary Vanguard”, no. 2, pp. 165–179.


Internet sources

Bab.la (n.d.), https://en.bab.la/ [accessed: 5.11.2023].

English-Corpora (n.d.), https://www.english-corpora.org/ [accessed: 7.11.2023].

Etutor (n.d.), https://www.etutor.pl/ [accessed: 4.11.2023].

Lexico (n.d.), https://www.lexico.com [accessed: 4.11.2023].

NKJP – Narodowy Korpus Języka Polskiego (n.d.), http://nkjp.pl/ [accessed: 7.11.2023].

Oxford Learner’s Dictionaries website (n.d.), https://www.oxfordlearnersdictionaries.com/ [accessed: 7.11.2023].

Oxford Text Checker (n.d.), https://www.oxfordlearnersdictionaries.com/text-checker/ [accessed: 7.11.2023].

ReversoContext (n.d.), https://context.reverso.net/translation/ [accessed: 7.11.2023].

Sketch Engine (n.d.), https://app.sketchengine.eu [accessed: 7.11.2023].

The Cambridge Dictionary (n.d.), https://dictionary.cambridge.org/ [accessed: 4.11.2023].

The Merriam-Webster Dictionary (n.d.), https://www.merriam-webster.com/ [accessed: 7.11.2023].


This work is available under the Creative Commons Attribution-ShareAlike 4.0 International licence.