Pokology - a community-driven site around GNU poke

poke maps

Table of Contents
_________________

1. Editing data using variables
2. Maps and map-files
3. Loading maps
4. Multiple perspectives of the same data
5. Auto-map
6. Creating and managing maps on the fly
7. Predefined maps


1 Editing data using variables
==============================

Editing data with GNU poke mainly involves creating mapped values and
storing them in Poke variables. However, this becomes inconvenient
when poking several files simultaneously, and as the complexity of the
data increases.

For example, if we were interested in altering the fields of the
header in an ELF file, we would map an Elf64_Ehdr struct at the
beginning of the underlying IO space (the file), like in:

,----
| (poke) .file foo.o
| (poke) load elf
| (poke) var ehdr = Elf64_Ehdr @ 0#B
`----

At this point the variable ehdr holds an Elf64_Ehdr structure, which
is mapped. As such, altering any of the fields of the struct will
update the corresponding bytes in foo.o. For example:

,----
| (poke) ehdr.e_entry = 0#B
`----

A Poke value has three mapping related attributes: whether it is
mapped, the offset at which it is mapped in an IO space, and in which
IO space.
This information is accessible to both the user and Poke programs
using the following attributes:

,----
| (poke) ehdr'mapped
| 1
| (poke) ehdr'offset
| 0UL#b
| (poke) ehdr'ios
| 0
`----

That is, ehdr is mapped at offset zero in the IO space #0, which
corresponds to foo.o:

,----
| (poke) .info ios
|   Id   Type   Mode   Size           Name
| * #0   FILE   rw     0x000004c8#B   ./foo.o
`----

Now that we have the ELF header, we may use it to get access to the
ELF section header table in the file, which we will reference using
another variable shdr:

,----
| (poke) var shdr = Elf64_Shdr[ehdr.e_shnum] @ ehdr.e_shoff
| (poke) shdr[1]
| Elf64_Shdr {
|   sh_name=0x1bU#B,
|   sh_type=0x1U,
|   sh_flags=#<ALLOC,EXECINSTR>,
|   sh_addr=0x0UL#B,
|   sh_offset=0x40UL#B,
|   sh_size=0xbUL#B,
|   sh_link=0x0U,
|   sh_info=0x0U,
|   sh_addralign=0x1UL,
|   sh_entsize=0x0UL#b
| }
`----

Variables are convenient entities to manipulate in Poke. Let's suppose
that the file has a lot of sections and we want to do some
transformation in every section. It is a time consuming operation, and
we may forget which sections we have already processed and which we
have not. We could create an empty array to hold the sections already
processed:

,----
| (poke) var processed = Elf64_Shdr[] ()
`----

And then, once we have processed some given section, add it to the
array:

,----
| ... edit shdr[23] ...
| (poke) processed += [shdr[23]]
`----

Note how the array processed is not mapped, but the sections contained
in it are mapped: Poke uses copy by shared value.

So, after we spend the day carefully poking our ELF file, we can ask
poke: are we done with all the sections in the file?

,----
| (poke) shdr'length == processed'length
| 1
`----

Yes, we are. This can be made as sophisticated as desired. We could
easily write a function that saves the contents of processed in files,
so we can continue hacking tomorrow, for example.
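
The bookkeeping described above could be partly automated. The
following is only a sketch: is_processed is a hypothetical helper, not
something provided by poke, and it assumes that the elf pickle is
loaded and that the elements of the array are still mapped, as
described above.

```poke
/* Hypothetical helper, not part of GNU poke.  Returns 1 if DONE
   contains a section header mapped at offset OFF, 0 otherwise.
   Assumes the elf pickle is loaded and that the elements of DONE
   are mapped.  */
fun is_processed = (Elf64_Shdr[] done, offset<uint<64>,b> off) int<32>:
{
  for (s in done)
    if (s'offset == off)
      return 1;
  return 0;
}
```

With this definition in place, a call such as
is_processed (processed, shdr[23]'offset) would tell us whether that
particular section header has already been handled, which is more
precise than comparing array lengths.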

We can then conclude that using mapped variables to edit data
structures stored in IO spaces works well in common and simple cases
like the above: we make our way mapping here and there, defining
variables to hold data that interests us, and it is easy to remember
that the variables ehdr and shdr are mapped, where they are mapped,
and that they are mapped in the file foo.o.

However, GNU poke allows editing more than one IO space
simultaneously. Let's say we now want to poke the sections of another
ELF file: bar.o. We would start by opening the file:

,----
| (poke) .file bar.o
| (poke) .info ios
|   Id   Type   Mode   Size           Name
| * #1   FILE   rw     0x000004c8#B   ./bar.o
|   #0   FILE   rw     0x000004c8#B   ./foo.o
`----

Now that bar.o is the current IO space, we can map its header. But
now, what variable should we use? We would rather not redefine ehdr,
because it already holds the header of foo.o. We could adapt our
naming scheme on the fly:

,----
| (poke) var foo_ehdr = ehdr
| (poke) var bar_ehdr = Elf64_Ehdr @ 0#B
`----

But then we would need to do the same for the other variables too:

,----
| (poke) var foo_shdr = shdr
| (poke) var bar_shdr = Elf64_Shdr[bar_ehdr.e_shnum] @ bar_ehdr.e_shoff
`----

However, we can easily see how this can degenerate quickly: what about
processed, for example? In general, as the number of IO spaces being
edited increases, it becomes more and more difficult to manage our
mapped variables, each of which is associated with one IO space.


2 Maps and map-files
====================

As we have seen, mapping variables is a very powerful, general and
flexible means to edit stored binary data in one or more IO spaces.
However, it is easy to lose track of where the variables are mapped
and, ideally speaking, we would want to have a means to refer to, say,
the "ELF header", and get the header as a mapped value regardless of
what specific file we are editing. Sort of a "meta variable". GNU poke
provides a way to do this: "maps".

A "map" can be conceived as a sort of "view" that can be applied to a
given IO space. Maps have entries, which are values mapped at some
given offset, under certain conditions. For example, we have seen that
an ELF file contains, among other things, a header at the beginning of
the file and a table of section headers whose size and location are
determined by the header. These would be two entries of a so-called
ELF map.

poke maps are defined in "map files". These files use the .map
extension. A map file self.map (for sectioned/simple elf) defining the
view of an ELF file as a header and a table of section headers would
look like this:

,----
| /* self.map - map file for a simplified view of an ELF file.  */
|
| load elf;
|
| %%
|
| %entry
| %name ehdr
| %type Elf64_Ehdr
| %offset 0#B
|
| %entry
| %name shdr
| %type Elf64_Shdr[(Elf64_Ehdr @ 0#B).e_shnum]
| %condition (Elf64_Ehdr @ 0#B).e_shnum > 0
| %offset (Elf64_Ehdr @ 0#B).e_shoff
`----

This map file defines a view of an ELF file as a header entry ehdr and
an entry with a table of section headers shdr.

The first section of the file, which spans until the separator line
containing %%, is arbitrary Poke code which, as we shall see, gets
evaluated before the map entries are processed. This is called the map
"prologue". In this case, the prologue contains a comment explaining
the purpose of the file, and a single load statement that loads the
elf.pk pickle, since the entries below use definitions like Elf64_Ehdr
that are defined by that pickle. The prologue is useful to define Poke
functions and other entities that are then used in the definitions of
the entries.

A separator line containing only %% separates the prologue from the
next section, which is a list of entry definitions. Each entry
definition starts with a line %entry, and has the following
attributes:

- A %name, like ehdr and shdr.
  These names should follow the same rules as Poke variables, but as
  we shall see later, map entries are not Poke variables. This
  attribute is mandatory.

- A %type. This can be any Poke expression denoting a type, like int,
  Elf64_Ehdr or Elf64_Shdr[(Elf64_Ehdr @ 0#B).e_shnum]. This attribute
  is mandatory.

- A %condition, which, if specified, determines whether to include the
  entry in the map. In the example above, the map will have an entry
  shdr only if the ELF file has one or more sections. Any Poke
  expression evaluating to a boolean can be used as a condition. This
  attribute is optional: entries not having a condition will always be
  included in the map.

- An %offset in the IO space, where the entry will be mapped. Any Poke
  expression evaluating to an offset can be used as an entry offset.
  This attribute is mandatory.


3 Loading maps
==============

So we have written our self.map, which denotes a view or structure of
ELF files we are interested in, and which resides in the current
working directory. How do we use it?

The first step is to fire up poke and open some object file. Let's
start with foo.o:

,----
| (poke) .file foo.o
`----

Now, we can load the map using the .map load dot-command:

,----
| (poke) .map load self
| [self](poke)
`----

The .map load self command makes poke look in certain directories for
a file called self.map, and load it. The list of directories where
poke looks for map files is encoded in the variable map_load_path as a
string containing a possibly empty list of directories separated by :
characters. Each directory is tried in turn. This variable is
initialized with suitable defaults:

,----
| (poke) map_load_path
| "/home/jemarch/.poke.d:.:/home/jemarch/.local/share/poke:/home/jemarch/gnu/hacks/poke/maps"
`----

Once a map is loaded, observe how the prompt changed to contain a
prefix [self]. This means that the map self is loaded for the current
IO space.
You can choose not to see this information in the prompt by setting
the prompt-maps option either at the prompt or in your .pokerc:

,----
| (poke) .set prompt-maps no
`----

By default prompt-maps is yes. This prompt aid is intended to provide
a cursory look at the "views" or maps loaded for the current IO space.
If we load another IO space and switch to it, the prompt changes
accordingly:

,----
| [self](poke) .mem foo
| The current IOS is now `*foo*'.
| (poke) .ios #0
| The current IOS is now `./foo.o'.
| [self](poke)
`----

At any time the .info maps dot-command can be used to obtain a full
list of loaded maps, with more information about them:

,----
| (poke) .info maps
| IOS   Name   Source
| #0    self   ./self.map
`----

In this case, there is a map self loaded in the IO space #0, which
corresponds to foo.o.

Once we make foo.o our current IO space, we can ask poke to show us
the entries corresponding to this map using another dot-command:

,----
| (poke) .map show self
| Offset      Entry
| 0x0UL#B     $self::ehdr
| 0x208UL#B   $self::shdr
`----

This tells us there are two entries for self in foo.o: $self::ehdr and
$self::shdr. Note how map entries use names that start with the $
character, followed by the name of the map and the name of the entry
we defined in the map file, separated by ::.

We can now use these entries at the prompt as if they were regular
mapped variables:

,----
| [self](poke) $self::ehdr
| Elf64_Ehdr {
|   e_ident=struct {
|     ei_mag=[0x7fUB,0x45UB,0x4cUB,0x46UB],
|     [...]
|   },
|   e_type=0x1UH,
|   e_machine=0x3eUH,
|   [...]
| }
| (poke) $self::shdr'length
| 11UL
`----

It is important to note, however, that map entries like $foo::bar are
*not* part of the Poke language, and are only available when using
poke interactively. Poke programs and scripts can't use them.
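
Since map entries are not available to Poke programs, a script that
needs the same data has to map it explicitly, as we did with variables
in the first section. The following is a minimal sketch of that
approach. The function name elf_shnum and its parameter are
hypothetical, and the code assumes the elf pickle provides Elf64_Ehdr
as used earlier.

```poke
load elf;

/* Hypothetical helper, not part of GNU poke: return the number of
   section headers of the ELF file open in the IO space FD.  The
   header is mapped explicitly at offset 0#B in that IO space,
   instead of going through a map entry such as $self::ehdr.  */
fun elf_shnum = (int<32> fd) uint<64>:
{
  var ehdr = Elf64_Ehdr @ fd : 0#B;
  return ehdr.e_shnum as uint<64>;
}
```

The argument is the identifier of an IO space, like the value reported
by ehdr'ios earlier. Passing the IO space explicitly means the function
does not depend on which IO space happens to be the current one.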

Let's now open another ELF file, and load the self map in it:

,----
| (poke) .file /usr/local/lib/libpoke.so.0.0.0
| (poke) .map load self
| [self](poke)
`----

So now we have two ELF files loaded in poke: foo.o and
libpoke.so.0.0.0, and in both IO spaces we have the self map loaded.
We can easily see that the map entries are different depending on the
current IO space:

,----
| [self](poke) .map show self
| Offset        Entry
| 0UL#B         $self::ehdr
| 3158952UL#B   $self::shdr
| [self](poke) .ios #0
| The current IOS is now `./foo.o'.
| [self](poke) .map show self
| Offset        Entry
| 0UL#B         $self::ehdr
| 520UL#B       $self::shdr
`----

foo.o is an object file, whereas libpoke.so.0.0.0 is a DSO:

,----
| (poke) .ios #0
| The current IOS is now `./foo.o'.
| [self](poke) $self::ehdr.e_type
| 1UH
| [self](poke) .ios #2
| The current IOS is now `/usr/local/lib/libpoke.so.0.0.0'.
| [self](poke) $self::ehdr.e_type
| 3UH
`----

The interpretation of the map entry $self::ehdr is different depending
on the current IO space. This makes it possible to refer to the "ELF
header" of the current file. Underneath, poke implements this by
defining mapped variables and "redirecting" the entry names $foo::bar
to the right variable depending on the IO space that is currently
selected. It hides all that complexity from us.


4 Multiple perspectives of the same data
========================================

It is perfectly possible (and useful!) to load more than one map in
the same IO space. It is very natural for a single file, for example,
to contain data that can be interpreted in several ways, or data of
different natures.

Let's for example open an ELF file again, this time one compiled with
-g:

,----
| (poke) .file foo.o
`----

We now load our self map, to get a view of the file as a collection of
sections:

,----
| (poke) .map load self
| [self](poke)
`----

And now we load the dwarf map that comes with poke, to get a view of
the file as having debugging information encoded in DWARF:

,----
| [self](poke) .map load dwarf
| [dwarf,self](poke)
`----

See how the prompt now reflects the fact that the current IO space
contains DWARF info! Let's take a look:

,----
| [dwarf,self](poke) .info maps
| IOS   Name    Source
| #0    dwarf   /home/jemarch/gnu/hacks/poke/maps/dwarf.map
| #0    self    ./self.map
| [dwarf,self](poke) .map show dwarf
| Offset      Entry
| 0x5bUL#B    $dwarf::info
`----

Now we can access entries from any of the loaded maps, i.e. access the
file from different perspectives. As an ELF file:

,----
| [dwarf,self](poke) $self::shdr[1]
| Elf64_Shdr {
|   sh_name=0xb5U#B,
|   sh_type=0x11U,
|   sh_flags=#<>,
|   sh_addr=0x0UL#B,
|   sh_offset=0x40UL#B,
|   sh_size=0x8UL#B,
|   sh_link=0x18U,
|   sh_info=0xfU,
|   sh_addralign=0x4UL,
|   sh_entsize=0x4UL#b
| }
`----

And as a file containing DWARF info:

,----
| [dwarf,self](poke) $dwarf::info
| Dwarf_CU_Header {
|   unit_length=#<0x0000004eU#B>,
|   version=0x4UH,
|   debug_abbrev_offset=#<0x00000000U#B>,
|   address_size=0x8UB#B
| }
`----

If you are curious about how the DWARF entries are defined, look at
maps/dwarf.map in the poke source distribution, or in your installed
poke (.info maps will tell you the file the map got loaded from).

It is possible to unload or remove a map from a given IO space using
the .map remove dot-command. Say we are done looking at the DWARF in
foo.o, and we are no longer interested in it as a file containing
debugging info. We can do:

,----
| [dwarf,self](poke) .map remove dwarf
| [self](poke)
`----

Note how the prompt was updated accordingly: only self remains as a
loaded map on this file.
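
A second perspective does not have to come from the maps shipped with
poke. As a sketch, the hypothetical map file below (the file name
sfirst.map and the entry name first are made up for illustration)
reuses only the syntax shown for self.map to expose the first section
header of an ELF file as an entry of its own. It assumes, like
self.map, that the elf pickle provides Elf64_Ehdr and Elf64_Shdr.

```poke
/* sfirst.map - hypothetical map file, for illustration only.
   Exposes the first section header of an ELF file.  */

load elf;

%%

%entry
%name first
%type Elf64_Shdr
%condition (Elf64_Ehdr @ 0#B).e_shnum > 0
%offset (Elf64_Ehdr @ 0#B).e_shoff
```

Loading it next to self with .map load sfirst should then make an
entry $sfirst::first available in the same IO space, alongside the
entries of self.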


5 Auto-map
==========

Certain maps make sense when editing certain types of data. For
example, dwarf.map is intended to be used in ELF files. In order to
ease using maps, poke provides a feature called "auto mapping", which
is disabled by default.

You can enable auto mapping like this:

,----
| (poke) .set auto-map yes
`----

When auto mapping is enabled, poke will look at the value of the
pre-defined variable auto_map, which must contain an array of pairs of
strings, associating a regular expression with a map name.

For example, you may want to initialize auto_map like this in your
.pokerc file:

,----
| auto_map = [[".*\\.mp3$", "mp3"],
|             [".*\\.o$", "elf"],
|             ["a\\.out$", "elf"]];
`----

This will make poke load mp3.map for every file whose name ends with
".mp3", and elf.map for files having names like foo.o and a.out.

Following the usual pokeish philosophy of being as unintrusive as
possible by default, the default value of auto_map is the empty array.


6 Creating and managing maps on the fly
=======================================

As we have seen, we can define our own maps using map files like
self.map, which contain a prologue and a set of map entries. However,
sometimes it is useful to create maps "on the fly" while we explore
some data with poke. To make this possible, poke provides a suitable
set of dot-commands.

Let's say we are poking some data, and we want to create a map for it.
We can do that like this:

,----
| (poke) .map create mymap
`----

This creates an empty map named mymap, with no entries:

,----
| [mymap](poke) .map show mymap
| Offset   Entry
`----

Adding entries is easy.
First, we have to map some variable, and then use it as the base for
the new entry:

,----
| [mymap](poke) var foo = int[3] @ 0#B
| [mymap](poke) .map entry add mymap, foo
| [mymap](poke) .map show mymap
| Offset    Entry
| 0x0UL#B   $mymap::foo
`----

Note how the entry $mymap::foo gets created, associated with the
current IO space and mapped at the same offset as the variable foo.

We can remove entries from existing maps using the .map entry remove
dot-command:

,----
| [mymap](poke) .map entry remove mymap, foo
| [mymap](poke) .map show mymap
| Offset   Entry
| [mymap](poke)
`----

We plan to add an additional command to save maps to map files. The
idea is that you can create your maps on the fly, save them, and then
load them back some other day when you are ready to continue poking.
This is not implemented yet though.


7 Predefined maps
=================

GNU poke comes with a set of useful pre-written maps, which get
installed in a system location. We want to expand this collection, so
please send us your map files!