« January 2007 | Main | January 2008 »
February 9, 2007
Finding array intersections faster in Perl (improving on Perl Cookbook)
Before I start, some background is in order since the term intersection isn't necessarily a common word.
The intersection between two sets (lists, arrays, etc), say A = {1,2,3} and B = {2,3,4} (written A ∩ B) is the items that are common between the two. In this case, A ∩ B = {2,3}.
Some code I'm modifying uses an intersection to find the common items after a sequence of filters is applied (i.e. each filter removes non-matching items from the original list, until we're left with the elements that matched every filter). The implementation used is one recommended in the Perl Cookbook, recipe 4.8:
@a = (1,2,3);
@b = (2,3,4);
%isect = %union = ();
foreach $e (@a, @b) { $union{$e}++ && $isect{$e}++ }
@isect_list = keys %isect
This works quite well, but I didn't need to find the union, and it requires no duplicate items within a list (there is another version of the recipe that would work with dups, but still requires tracking the union to find the intersection).
I came up with a different recipe. This is specific for intersection and doesn't assume uniqueness in the lists:
@a = (1,2,3);
@b = (2,3,4);
%isect = ();
map { $isect{$_} = 1 } @a
@isect_list = grep { $isect{$_} } @b;
As a bonus, benchmarking this recipe vs the one above shows a 3x performance increase.
If you needed to find the union (all items present in either list), this is what it would look like:
@a = (1,2,3);
@b = (2,3,4);
%union = ();
map { $union{$_} = 1 } @a, @b;
@union_list = keys %union;
This also has a 3x performance increase over the recipe.
Posted by pete at 4:20 PM | Comments (1)
February 5, 2007
Several ways to find "end of yesterday" in MySQL
Today I needed to find the time for "the end of yesterday" (i.e. 11:59:59 pm yesterday). MySQL has a handy month-end function LAST_DAY() which "takes a date or datetime value and returns the corresponding value for the last day of the month." Doesn't look like there's an equivalent built-in function for end-of-day, so I came up with a few different ways:
(now() - interval time_to_sec(time(now()))+1 second)concat(date(now() - interval 1 day),' 23:59:59')subtime(now(),time(now())) - interval 1 second
I went with the last one, it seemed the most straight-forward. If there's a better way, I'd love to hear about it.
Posted by pete at 2:07 PM