... action detection and tracking, and one segment of the text. In order to annotate unambiguously, we need to detect and track both landmarks and actions. A landmark is a hallway or a door, and an action ... for each episode. For example, in "go straight and make the first left you can, then go into the first door on the right side and stop", LEFT and FIRST occur exactly once for the first action, and ... "room", "doorway" and their plural forms map to DOOR, and the ordinal number will be represented by "first" and "1st", and so on.

System 2: Baseline

Suppose we have clean data and there is no need...