duktape/bugs/issue-144bbcf65f2a145246ade...


								--- !ditz.rubyforge.org,2008-03-06/issue

								title: improve Unicode string lookup performance with an offset table

								desc: |-

								  Unicode string performance could be improved by adding some speedup data

								  after the actual string data.  This can be done when interning the string,

								  and string heaphdr flags can indicate which kind of speedup data is present.


								  For example, one could add an index giving byte offsets for character

								  offsets 0, 256, 512, ....  Support for using the index could be added to

								  duk_heap_strcache_offset_char2byte(), and it would be easy to make the

								  index feature a compile time option.
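
								  The byte offset index described above could be sketched roughly as
								  follows.  This is only an illustration under stated assumptions: the
								  names build_offset_index(), char2byte_indexed() and IDX_STRIDE are
								  hypothetical and not part of the Duktape sources, the index is allocated
								  separately instead of being stored after the string data, and the string
								  is assumed to be well-formed UTF-8 style data where continuation bytes
								  have the form 10xxxxxx.

								  ```c
								  #include <assert.h>
								  #include <stddef.h>
								  #include <stdint.h>
								  #include <stdlib.h>

								  #define IDX_STRIDE 256  /* characters per index entry (assumed value) */

								  /* A byte starts a character unless it is a continuation byte. */
								  static int is_char_start(uint8_t b) {
								      return (b & 0xc0) != 0x80;
								  }

								  /* Hypothetical helper: record the byte offset of character offsets
								   * 0, 256, 512, ...  An entry equal to blen is added when the character
								   * count is a multiple of the stride, so every valid character offset
								   * (including end-of-string) has an entry at or before it.
								   */
								  static uint32_t *build_offset_index(const uint8_t *data, size_t blen,
								                                      size_t *count) {
								      size_t i, nchars = 0, n = 0;
								      uint32_t *idx;

								      for (i = 0; i < blen; i++) {
								          if (is_char_start(data[i])) nchars++;
								      }
								      *count = nchars / IDX_STRIDE + 1;
								      idx = malloc(*count * sizeof(uint32_t));
								      if (idx == NULL) return NULL;

								      nchars = 0;
								      for (i = 0; i < blen; i++) {
								          if (is_char_start(data[i])) {
								              if (nchars % IDX_STRIDE == 0) idx[n++] = (uint32_t) i;
								              nchars++;
								          }
								      }
								      if (nchars % IDX_STRIDE == 0) idx[n++] = (uint32_t) blen;
								      assert(n == *count);
								      return idx;
								  }

								  /* Hypothetical lookup: jump to the nearest indexed offset at or before
								   * char_off, then scan forward over at most IDX_STRIDE - 1 characters.
								   * The caller must ensure char_off <= character length of the string.
								   */
								  static size_t char2byte_indexed(const uint8_t *data, size_t blen,
								                                  const uint32_t *idx, size_t char_off) {
								      size_t i = idx[char_off / IDX_STRIDE];
								      size_t remaining = char_off % IDX_STRIDE;

								      while (remaining > 0 && i < blen) {
								          i++;  /* step over the lead byte of the current character */
								          while (i < blen && !is_char_start(data[i])) i++;
								          remaining--;
								      }
								      return i;
								  }
								  ```

								  With this layout a lookup never scans more than 255 characters,
								  regardless of string length, at a cost of one 32-bit entry per 256
								  characters.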


								  A smaller footprint index would list the number of bytes used to encode

								  each chunk (e.g. 256) of characters.  This would not allow random access

								  but would still be faster than raw seeking; the index would also be smaller

								  than in a plain byte offset index.
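
								  A minimal sketch of the smaller chunk size index, again with
								  hypothetical names (build_chunk_sizes(), char2byte_chunked(),
								  CHUNK_CHARS) that are not taken from the Duktape sources.  It assumes
								  well-formed UTF-8 style data and that the byte count of one chunk fits
								  in 16 bits, which holds as long as a character takes at most 255 bytes.

								  ```c
								  #include <stddef.h>
								  #include <stdint.h>

								  #define CHUNK_CHARS 256  /* characters per chunk (assumed value) */

								  /* A byte starts a character unless it is a continuation byte. */
								  static int is_char_start(uint8_t b) {
								      return (b & 0xc0) != 0x80;
								  }

								  /* Hypothetical helper: sizes[k] receives the number of bytes used to
								   * encode complete chunk k.  A trailing partial chunk gets no entry.
								   * Returns the number of entries written (at most cap).
								   */
								  static size_t build_chunk_sizes(const uint8_t *data, size_t blen,
								                                  uint16_t *sizes, size_t cap) {
								      size_t i, nchars = 0, chunk_start = 0, n = 0;

								      for (i = 0; i < blen; i++) {
								          if (is_char_start(data[i])) {
								              if (nchars > 0 && nchars % CHUNK_CHARS == 0) {
								                  if (n == cap) return n;
								                  sizes[n++] = (uint16_t) (i - chunk_start);
								                  chunk_start = i;
								              }
								              nchars++;
								          }
								      }
								      /* The last chunk may end exactly at the end of the string. */
								      if (nchars > 0 && nchars % CHUNK_CHARS == 0 && n < cap) {
								          sizes[n++] = (uint16_t) (blen - chunk_start);
								      }
								      return n;
								  }

								  /* Hypothetical lookup: sum the sizes of the whole chunks before
								   * char_off (a linear pass, so not true random access), then scan
								   * the remaining characters byte by byte.
								   */
								  static size_t char2byte_chunked(const uint8_t *data, size_t blen,
								                                  const uint16_t *sizes, size_t nchunks,
								                                  size_t char_off) {
								      size_t k, i = 0, remaining;
								      size_t whole = char_off / CHUNK_CHARS;

								      if (whole > nchunks) whole = nchunks;
								      for (k = 0; k < whole; k++) i += sizes[k];
								      remaining = char_off - whole * CHUNK_CHARS;

								      while (remaining > 0 && i < blen) {
								          i++;  /* step over the lead byte of the current character */
								          while (i < blen && !is_char_start(data[i])) i++;
								          remaining--;
								      }
								      return i;
								  }
								  ```

								  The lookup still walks the size table linearly, but it touches one
								  16-bit entry per 256 characters instead of every byte of the string.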


								  It's not clear that any sort of speedup index is actually needed, so it

								  would be nice to implement one as an option.  The upside of an index, even

								  a sparse one, is that it would give a reasonable upper bound on the time

								  it takes to access random characters.  Currently there is no upper bound:

								  if a string is 10 megabytes in size, it may require a straight ~5 megabyte

								  scan to find a character.

								type: :task

								component: duk

								release:

								reporter: sva <sami.vaarala@iki.fi>

								status: :unstarted

								disposition:

								creation_time: 2013-03-03 09:51:16.369359 Z

								references: []


								id: 144bbcf65f2a145246adefbc8d071288d2e91248

								log_events:

								- - 2013-03-03 09:51:16.694362 Z

								  - sva <sami.vaarala@iki.fi>

								  - created

								  - ""