GhostWire Studios - Flash/Flex UI Components Development And Consulting Services
Quality User Interface Controls For Flash Application DevelopmentAspireUI Components

Sep 07 2009

[AS3] Finding Occurrences Of A Sequence Of Bytes Within A ByteArray

Published by sunny at 11:05 am under Flash, Flash AS3, Tips

The following is a simple way to search within a ByteArray for specific patterns of bytes. At the end of this post is a ByteArrayUtils class that contains the indexOf(), indicesOf and lastIndexOf() methods discussed below. If you just want to grab the code and skip all the walk-through, please feel free to jump there.


Finding The First Occurrence Of The Pattern – indexOf()

  • We want to know the index position of the first occurrence of the pattern, if any.
  • We take the first byte of the sequence we are looking for, and scan through the target ByteArray byte-by-byte to find a match.
  • Once a match is found, we take the last byte of the sequence we are looking for, and check if the byte at the corresponding position in the target ByteArray also matches. If not, we have found a false candidate and therefore the bytes in-between need not be tested.
  • We continue working backwards until all the bytes in the sequence are tested, aborting if a comparison is false.
  • Once all the bytes in the sequence are tested positive, we have found an occurrence of the pattern.
public static function indexOf(target:ByteArray, pattern:*, fromIndex:int = 0):int
{
	var arr:Array, end:Boolean, found:Boolean, a:int, i:int, j:int, k:int;
 
	var toFind:ByteArray = toByteArray(pattern);
	if (toFind == null)
	{
		// ** type of pattern unsupported **
		throw new Error("Unsupported Pattern");
		return;
	}
 
	a	= toFind.length;
	j	= target.length - a;
 
	if (fromIndex < 0)
	{
		i = j + fromIndex;
		if (i < 0)
		{
			return -1;
		}
	}
	else
	{
		i = fromIndex;
	}
	while (!end)
	{
		if (target[i] == toFind[0])
		{
			// ** found a possible candidate **
			found	= true;
			k		= a;
			while (--k)
			{
				if (target[i + k] != toFind[k])
				{
					// ** doesn't match, false candidate **
					found = false;
					break;
				}
			}
			if (found)
			{
				return i;
			}
		}
		if (fromIndex < 0)
		{
			end = (--i < 0);
		}
		else
		{
			end = (++i > j);
		}
	}
	return -1;
}


Defining The Pattern
The indexOf() method has the following signature:

public static function indexOf(target:ByteArray, pattern:*, fromIndex:int = 0):int

If you have the sequence of bytes to find already defined as a ByteArray, you can use that for the pattern parameter (ie, you can look for a ByteArray within a ByteArray):

ByteArrayUtils.indexOf(target, sequence); // where sequence is a ByteArray object

Otherwise, you can define the sequence of bytes as an Array of byte values:

ByteArrayUtils.indexOf(target, [0x72, 0x69, 0x76, 0x76, 0x79]);

You can also define a String to look up:

ByteArrayUtils.indexOf(target, "HELLO");

The toByteArray() method helps to convert Array or String to ByteArray:

public static function toByteArray(obj:*):ByteArray
{
	var toFind:ByteArray;
	if (obj is ByteArray)
	{
		toFind = obj;
	}
	else
	{
		toFind	= new ByteArray();
		if (obj is Array)
		{
			// ** looking for a sequence of target **
			var i:int = obj.length;
			while (i--)
			{
				toFind[i] = obj[i];
			}
		}
		else if (obj is String)
		{
			// ** looking for a sequence of string characters **
			toFind.writeUTFBytes(obj);
		}
		else
		{
			return null;
		}
	}
	return toFind;
}


Specifying Where To Begin Searching – fromIndex

  • By default, we will scan from index position zero to the last valid index position (which is “length of target byte array less the length of sequence of bytes to find”).
  • To begin searching with an offset, specify the fromIndex parameter.
ByteArrayUtils.indexOf(target, "HELLO", 1000); // skip the first 1000 bytes


Finding Multiple Occurrences – indicesOf()

  • The indicesOf() method returns an array containing the index positions found. If no match is found, an empty array is returned.
var indices:Array = ByteArrayUtils.indicesOf(target, [0x72, 0x69, 0x76, 0x76, 0x79]);
public static function indicesOf(target:ByteArray, pattern:*, fromIndex:uint = 0):Array
{
	var arr:Array			= [];
	var toFind:ByteArray	= toByteArray(pattern);
	if (toFind == null)
	{
		// ** type of pattern unsupported **
		throw new Error("Unsupported Pattern");
		return;
	}
	var i:int			= indexOf(target, toFind, fromIndex);
	while (i != -1)
	{
		arr[arr.length] = i;
		i = indexOf(target, toFind, i + 1);
	}
	return arr;
}


Finding The Last Occurrence – lastIndexOf()

  • Use the lastIndexOf() method to find the last, instead of the first, occurrence of the pattern.
  • The indexOf() method has been coded to search backwards if the fromIndex parameter is a negative integer.
  • The lastIndexOf() therefore simply calls the indexOf() method passing -1 for the fromIndex parameter.


Testing By Unsigned Integers
I tried testing four bytes at a time, ie by comparing unsigned integers, but the performance ended up worse off. I have therefore removed that implementation. The following was the idea behind that implementation:

  • If the sequence we are looking for is more than five bytes, we will test for matches using unsigned integers (so four bytes at a time). If the sequence is five bytes or less, we stick to byte-by-byte scan because the first and last byte would have been tested (so left three or less bytes to test).
  • As long as there are four or more bytes left to test, unsigned integer comparison is used, otherwise byte value comparison is used.


ByteArrayUtils Class
ByteArrayUtils ActionScript 3.0 Class

ByteArrayUtils.zip

pixelstats trackingpixel
Share or Bookmark This Post:
  • StumbleUpon
  • email
  • Digg
  • del.icio.us
  • Facebook
  • Google Bookmarks
  • Live
  • Yahoo! Buzz
  • Netvibes
  • NewsVine
  • Reddit
  • Slashdot
  • Technorati
  • BlinkList
  • Mixx
  • Diigo
  • Faves
  • Suggest to Techmeme via Twitter
  • Twitter

Other Posts You Might Enjoy:

     

No responses yet

Comments RSS

Leave a Reply

*
To prove you're a person (not a spam script), type the security word shown in the picture. Click on the picture to hear an audio file of the word.
Click to hear an audio file of the anti-spam word