--- !ditz.rubyforge.org,2008-03-06/issue
title: improve Unicode string lookup performance with an offset table
desc: |-
  Unicode string performance could be improved by adding some speedup data
  after the actual string data. This can be done when interning the
  string, and string heaphdr flags can indicate which kind of speedup data
  is present.

  For example, one could add an index giving byte offsets for character
  offsets 0, 256, 512, ... Support for using the index could be added to
  duk_heap_strcache_offset_char2byte(), and it would be easy to make the
  index feature a compile time option.

  A smaller footprint index would list the number of bytes used to encode
  each chunk of characters (e.g. chunks of 256 characters). This would not
  allow direct random access, but whole chunks could be skipped, so seeking
  would still be faster than a raw scan; such an index would also be
  smaller than a plain byte offset index.

  It's not clear that any sort of speedup index is actually needed, so it
  would be best implemented as an option. The upside of an index, even a
  sparse one, is that it would give a reasonable upper bound on the time
  it takes to access a random character. Currently there is no useful
  upper bound: if a string is 10 megabytes in size, finding a character
  may require a straight scan of ~5 megabytes on average.
type: :task
component: duk
release:
reporter: sva
status: :unstarted
disposition:
creation_time: 2013-03-03 09:51:16.369359 Z
references: []

id: 144bbcf65f2a145246adefbc8d071288d2e91248

log_events:
- - 2013-03-03 09:51:16.694362 Z
  - sva
  - created
  - ""
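#
# A minimal sketch (not actual Duktape code) of the lookup that a sparse
# byte offset index would enable, assuming a 256-character chunk size and
# an index holding byte offsets for character offsets 0, 256, 512, ...
# The helper name char2byte_indexed and the index layout are illustrative
# assumptions; the real integration point would be
# duk_heap_strcache_offset_char2byte() as noted in the description. Plain
# UTF-8 byte structure is assumed, where continuation bytes have the form
# 10xxxxxx.
#
#   #include <stddef.h>
#   #include <stdint.h>
#
#   #define CHUNK_SIZE 256  /* characters per index entry */
#
#   /* 'data'/'blen': UTF-8 bytes of the string; 'idx': byte offset of
#    * every CHUNK_SIZE'th character, built when the string is interned.
#    * Returns the byte offset corresponding to character offset
#    * 'char_off' (assumed to be within the string).
#    */
#   static size_t char2byte_indexed(const uint8_t *data, size_t blen,
#                                   const uint32_t *idx, size_t char_off) {
#       size_t byte_off = idx[char_off / CHUNK_SIZE];  /* jump close to target */
#       size_t skip = char_off % CHUNK_SIZE;           /* <= 255 chars left to scan */
#
#       while (skip > 0 && byte_off < blen) {
#           byte_off++;
#           /* Skip continuation bytes to land on the next character start. */
#           while (byte_off < blen && (data[byte_off] & 0xc0) == 0x80) {
#               byte_off++;
#           }
#           skip--;
#       }
#       return byte_off;
#   }
#
# The scan is bounded to at most CHUNK_SIZE - 1 characters regardless of
# string size, which provides the upper bound argued for above. The
# smaller footprint variant would instead store the encoded byte length of
# each chunk; the lookup would then sum the lengths of the preceding
# chunks before doing the same bounded scan.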