« January 2007 | Main | January 2008 »

February 9, 2007

Finding array intersections faster in Perl (improving on Perl Cookbook)

Before I start, some background is in order since the term intersection isn't necessarily a common word.

The intersection between two sets (lists, arrays, etc), say A = {1,2,3} and B = {2,3,4} (written A ∩ B) is the items that are common between the two. In this case, A ∩ B = {2,3}.

Some code I'm modifying uses an intersection to find the common items after a sequence of filters is applied (i.e. each filter removes non-matching items from the original list, until we're left with the elements that matched every filter). The implementation used is one recommended in the Perl Cookbook, recipe 4.8:

@a = (1,2,3);
@b = (2,3,4);
%isect = %union = ();
foreach $e (@a, @b) { $union{$e}++ && $isect{$e}++ }
@isect_list = keys %isect

This works quite well, but I didn't need to find the union, and it requires no duplicate items within a list (there is another version of the recipe that would work with dups, but still requires tracking the union to find the intersection).

I came up with a different recipe. This is specific for intersection and doesn't assume uniqueness in the lists:

@a = (1,2,3);
@b = (2,3,4);
%isect = ();
map { $isect{$_} = 1 } @a
@isect_list = grep { $isect{$_} } @b;

As a bonus, benchmarking this recipe vs the one above shows a 3x performance increase.

If you needed to find the union (all items present in either list), this is what it would look like:

@a = (1,2,3);
@b = (2,3,4);
%union = ();
map { $union{$_} = 1 } @a, @b;
@union_list = keys %union;

This also has a 3x performance increase over the recipe.

Posted by pete at 4:20 PM | Comments (1)

February 5, 2007

Several ways to find "end of yesterday" in MySQL

Today I needed to find the time for "the end of yesterday" (i.e. 11:59:59 pm yesterday). MySQL has a handy month-end function LAST_DAY() which "takes a date or datetime value and returns the corresponding value for the last day of the month." Doesn't look like there's an equivalent built-in function for end-of-day, so I came up with a few different ways:

I went with the last one, it seemed the most straight-forward. If there's a better way, I'd love to hear about it.

Posted by pete at 2:07 PM

golf tips