
Revisiting Automated Topic Model Evaluation with Large Language Models


@inproceedings{stammbach-etal-2023-revisiting,
    title = "Revisiting Automated Topic Model Evaluation with Large Language Models",
    author = "Stammbach, Dominik  and
      Zouhar, Vil{\'e}m  and
      Hoyle, Alexander  and
      Sachan, Mrinmaya  and
      Ash, Elliott",
    editor = "Bouamor, Houda  and
      Pino, Juan  and
      Bali, Kalika",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.581",
    doi = "10.18653/v1/2023.emnlp-main.581",
    pages = "9348--9357",
    abstract = "Topic models help us make sense of large text collections. Automatically evaluating their output and determining the optimal number of topics are both longstanding challenges, with no effective automated solutions to date. This paper proposes using large language models (LLMs) for these tasks. We find that LLMs appropriately assess the resulting topics, correlating more strongly with human judgments than existing automated metrics. However, the setup of the evaluation task is crucial {---} LLMs perform better on coherence ratings of word sets than on intrusion detection. We find that LLMs can also guide us towards a reasonable number of topics. In actual applications, topic models are typically used to answer a research question related to a collection of texts. We can incorporate this research question in the prompt to the LLM, which helps estimate the optimal number of topics.",
}
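
The abstract above describes prompting an LLM to rate the coherence of a topic's top words, the task framing that correlated most strongly with human judgments. A minimal sketch of that idea in Python, assuming a generic `ask_llm` callable that sends a prompt and returns the model's text reply; the prompt wording, the 1-3 scale, and the answer parsing are illustrative assumptions, not the authors' exact protocol:

```python
from typing import Callable

def rate_topic_coherence(top_words: list[str], ask_llm: Callable[[str], str]) -> int:
    """Ask an LLM to rate how related a topic's top words are, on a 1-3 scale.

    Hypothetical setup: `ask_llm` is any function that sends a prompt to an
    LLM and returns its text response; the prompt below is illustrative.
    """
    prompt = (
        "Rate how related the following words are to each other, from "
        "1 (not very related) to 3 (very related). Reply with one number.\n\n"
        f"Words: {', '.join(top_words)}"
    )
    reply = ask_llm(prompt)
    # Take the first digit in the reply; fall back to the lowest score.
    digits = [c for c in reply if c.isdigit()]
    return int(digits[0]) if digits else 1

# Stubbed example in place of a real LLM call:
print(rate_topic_coherence(["court", "judge", "trial", "jury", "verdict"],
                           ask_llm=lambda prompt: "3"))
```

The alternative framing the paper tests, word-intrusion detection (spot the word that does not belong), can reuse the same scaffold with a different prompt; per the abstract, LLMs handle it less well.
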
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="stammbach-etal-2023-revisiting">
    <titleInfo>
        <title>Revisiting Automated Topic Model Evaluation with Large Language Models</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Dominik</namePart>
        <namePart type="family">Stammbach</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Vilém</namePart>
        <namePart type="family">Zouhar</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Alexander</namePart>
        <namePart type="family">Hoyle</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Mrinmaya</namePart>
        <namePart type="family">Sachan</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Elliott</namePart>
        <namePart type="family">Ash</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2023-12</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Houda</namePart>
            <namePart type="family">Bouamor</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Juan</namePart>
            <namePart type="family">Pino</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Kalika</namePart>
            <namePart type="family">Bali</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Singapore</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Topic models help us make sense of large text collections. Automatically evaluating their output and determining the optimal number of topics are both longstanding challenges, with no effective automated solutions to date. This paper proposes using large language models (LLMs) for these tasks. We find that LLMs appropriately assess the resulting topics, correlating more strongly with human judgments than existing automated metrics. However, the setup of the evaluation task is crucial — LLMs perform better on coherence ratings of word sets than on intrusion detection. We find that LLMs can also guide us towards a reasonable number of topics. In actual applications, topic models are typically used to answer a research question related to a collection of texts. We can incorporate this research question in the prompt to the LLM, which helps estimate the optimal number of topics.</abstract>
    <identifier type="citekey">stammbach-etal-2023-revisiting</identifier>
    <identifier type="doi">10.18653/v1/2023.emnlp-main.581</identifier>
    <location>
        <url>https://aclanthology.org/2023.emnlp-main.581</url>
    </location>
    <part>
        <date>2023-12</date>
        <extent unit="page">
            <start>9348</start>
            <end>9357</end>
        </extent>
    </part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Revisiting Automated Topic Model Evaluation with Large Language Models
%A Stammbach, Dominik
%A Zouhar, Vilém
%A Hoyle, Alexander
%A Sachan, Mrinmaya
%A Ash, Elliott
%Y Bouamor, Houda
%Y Pino, Juan
%Y Bali, Kalika
%S Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
%D 2023
%8 December
%I Association for Computational Linguistics
%C Singapore
%F stammbach-etal-2023-revisiting
%X Topic models help us make sense of large text collections. Automatically evaluating their output and determining the optimal number of topics are both longstanding challenges, with no effective automated solutions to date. This paper proposes using large language models (LLMs) for these tasks. We find that LLMs appropriately assess the resulting topics, correlating more strongly with human judgments than existing automated metrics. However, the setup of the evaluation task is crucial — LLMs perform better on coherence ratings of word sets than on intrusion detection. We find that LLMs can also guide us towards a reasonable number of topics. In actual applications, topic models are typically used to answer a research question related to a collection of texts. We can incorporate this research question in the prompt to the LLM, which helps estimate the optimal number of topics.
%R 10.18653/v1/2023.emnlp-main.581
%U https://aclanthology.org/2023.emnlp-main.581
%U https://doi.org/10.18653/v1/2023.emnlp-main.581
%P 9348-9357

Markdown (Informal)

[Revisiting Automated Topic Model Evaluation with Large Language Models](https://aclanthology.org/2023.emnlp-main.581) (Stammbach et al., EMNLP 2023)

ACL

Dominik Stammbach, Vilém Zouhar, Alexander Hoyle, Mrinmaya Sachan, and Elliott Ash. 2023. [Revisiting Automated Topic Model Evaluation with Large Language Models](https://aclanthology.org/2023.emnlp-main.581). In *Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing*, pages 9348–9357, Singapore. Association for Computational Linguistics.
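
The abstract's second idea, folding the research question into the LLM prompt to choose the number of topics K, follows the same pattern: fit a model per candidate K, have the LLM rate each resulting topic's usefulness for the question, and keep the K with the highest mean rating. A hedged sketch; the prompt text, rating scale, and demo data below are all hypothetical, not the paper's implementation:

```python
from statistics import mean
from typing import Callable

def score_num_topics(
    topics_by_k: dict[int, list[list[str]]],
    research_question: str,
    ask_llm: Callable[[str], str],
) -> dict[int, float]:
    """Score each candidate K by the mean LLM rating of its topics, with
    the research question included in every prompt (illustrative wording)."""
    scores = {}
    for k, topics in topics_by_k.items():
        ratings = []
        for top_words in topics:
            prompt = (
                f"A topic model was fit to help answer: {research_question}\n"
                "Rate how useful this topic is for that question, from "
                "1 (not useful) to 3 (very useful). Reply with one number.\n"
                f"Topic words: {', '.join(top_words)}"
            )
            # Take the first digit in the reply; fall back to the lowest score.
            digits = [c for c in ask_llm(prompt) if c.isdigit()]
            ratings.append(int(digits[0]) if digits else 1)
        scores[k] = mean(ratings)
    return scores  # pick the K with the highest mean rating

# Stubbed example: two candidate K values, toy topics, fake LLM reply.
demo = {
    2: [["tax", "income", "rate"], ["school", "teacher", "student"]],
    3: [["tax", "income"], ["school", "teacher"], ["misc", "various"]],
}
print(score_num_topics(demo, "How do states differ in school funding?",
                       ask_llm=lambda prompt: "2"))
```
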


