Poster: Scrabble: Converting Unstructured Metadata into Brick for Many Buildings


Buildings traditionally consist of vertically integrated subsystems installed by multiple vendors at different times, without a common understanding of the entire system. The result is unstructured metadata for thousands of data points, which third-party vendors who seek to deploy applications such as fault diagnosis must map into a common schema. This mapping process requires deep domain expertise in both the schema and the buildings, along with significant man-hours. Our framework, Scrabble, significantly reduces the effort of mapping multiple buildings by introducing a two-stage active learning mechanism that exploits the structure present in a standard schema, Brick, and learns from buildings that have already been mapped to the schema. Scrabble maps the characters of metadata into an intermediate representation (IR) using conditional random fields, and then maps the IR to labels with a modified classifier chain. Introducing the IR enables reusing the learned model for other buildings. Our model requires only minimal input from domain experts for mapping. In our evaluation, Scrabble requires 60% fewer samples to achieve 95% accuracy, while covering more labels with 2.54 times higher macro F1, compared to a naive baseline.
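
To make the character-to-IR step concrete, the following is a minimal sketch of how per-character tags might be collapsed into IR phrases. It assumes a BIO-style tagging scheme, and the point name `RM101.ZNT`, the tag names, and the helper `decode_ir` are all hypothetical illustrations, not taken from Scrabble itself. The tags are written by hand here; in the framework they would come from the conditional random field.

```python
from typing import List, Optional, Tuple


def decode_ir(chars: str, tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse per-character BIO tags into (substring, phrase) pairs.

    Assumption: "B-x" starts phrase x, "I-x" continues it, "O" is filler.
    """
    if len(chars) != len(tags):
        raise ValueError("need exactly one tag per character")
    spans: List[Tuple[str, str]] = []
    text, phrase = "", None  # type: str, Optional[str]
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new phrase begins; flush the one in progress, if any.
            if phrase is not None:
                spans.append((text, phrase))
            text, phrase = ch, tag[2:]
        elif tag.startswith("I-") and phrase == tag[2:]:
            # Continue the current phrase.
            text += ch
        else:
            # "O" or an inconsistent "I-" tag closes the current phrase;
            # the character itself is dropped.
            if phrase is not None:
                spans.append((text, phrase))
            text, phrase = "", None
    if phrase is not None:
        spans.append((text, phrase))
    return spans


# Hypothetical raw point name and hand-written tags (one per character).
raw = "RM101.ZNT"
tags = [
    "B-room", "I-room",
    "B-id", "I-id", "I-id",
    "O",
    "B-zone_temperature", "I-zone_temperature", "I-zone_temperature",
]
ir = decode_ir(raw, tags)
print(ir)
```

Because the resulting phrases no longer depend on a building's naming convention, a second-stage classifier trained on them for one building can, in principle, be reused on another, which is the reuse the abstract attributes to the IR.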

Proceedings of the 4th ACM International Conference on Systems for Energy-Efficient Built Environments, 2017