Keywords: large language models, semantic linking, table metadata
Abstract: Data augmentation is essential for matching table metadata, such as column names, to a knowledge graph without accessing the table's data or content. Previous work used large language models (LLMs) to enrich each column name into a single-sentence description but did not consider the table header as a whole. In this work, we propose a two-stage LLM-based process for generating column descriptions that leverages all available table metadata. The results highlight the importance of the table header as broader context for data augmentation, with an 11–30% improvement in Hit@k for table metadata matching across two datasets.
Submission Number: 8