Dear Monks
I was contemplating the 256 commandments from Damian’s "Perl Best Practices" when I encountered:
List Generation: Use map instead of for when generating new lists from old.
Being more used to other programming languages, I often take the non-Perl approach to get the job done: I would typically use a for and probably not even consider map. So this was an eye-opener for me. I decided to do a little test to see how big the difference between the two really is.
(FYI: Perl v5.8.8 built for MSWin32-x86-multi-thread running on a Dell INSPIRON 9400)
I used the example mentioned by Damian, together with the Benchmark module, to test:
use strict;
use warnings;
use Benchmark qw(:all);

my @results;
my $count = -5;

# Populate the list with 1 million numbers
for (my $i = 0; $i < 1000_000; $i++) {
    push @results, $i;
}

cmpthese( $count, {
    for => "test_for;",
    map => "test_map;",
} );

timethese( $count, {
    for => "test_for;",
    map => "test_map;",
} );

sub test_for {
    my @sqrt_results;
    for my $result (@results) {
        push @sqrt_results, sqrt($result);
    }
}

sub test_map {
    my @sqrt_results = map { sqrt $_ } @results;
}
First the comparison:
$count=-1  (warning: too few iterations for a reliable count)
       Rate   for   map
for  2.67/s    --  -10%
map  2.98/s   12%    --

$count=-5
       Rate   map   for
map  3.05/s    --  -16%
for  3.61/s   18%    --

$count=-10
       Rate   for   map
for  2.73/s    --   -8%
map  2.95/s    8%    --
Hmmm, this “gain” from using map over for is not really impressive, is it?!
Next some timing:
$count=-1
Benchmark: running for, map for at least 1 CPU seconds...
       for:  1 wallclock secs ( 1.16 usr +  0.00 sys =  1.16 CPU) @  3.46/s (n=4)
       map:  2 wallclock secs ( 1.19 usr +  0.00 sys =  1.19 CPU) @  3.37/s (n=4)

$count=-5
Benchmark: running for, map for at least 5 CPU seconds...
       for:  6 wallclock secs ( 5.22 usr +  0.00 sys =  5.22 CPU) @  3.64/s (n=19)
       map:  6 wallclock secs ( 5.22 usr +  0.00 sys =  5.22 CPU) @  3.45/s (n=18)

$count=-10
Benchmark: running for, map for at least 10 CPU seconds...
       for: 10 wallclock secs (10.11 usr +  0.00 sys = 10.11 CPU) @  3.46/s (n=35)
       map: 10 wallclock secs (10.03 usr +  0.00 sys = 10.03 CPU) @  3.29/s (n=33)
Am I missing something? Is the example given by Damian a poor example? Should I really favor map over for when I want to generate a new list from another list?
Thanks in advance
Update
Besides the obvious advantages (less code, easier to understand), it is stated that map is normally considerably faster.
The bigger point, though, is that by using map, you're telling me more about your intent with the code. map says "I'm doing something to each element, something that's probably easily described, and accumulating the result." On the other hand, for says "I'm doing something with each element and it could be anything."
That's pretty much the distinction I'd make as well, although I usually phrase it another way: for is for generic iteration, map is specifically a transformation.
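A tiny illustration of that distinction (using a hypothetical @numbers, purely for contrast):

# map is a transformation: the resulting list is the whole point
my @squares = map { $_ ** 2 } @numbers;

# for is generic iteration: the body can do anything, including side effects
for my $n (@numbers) {
    print "processing $n\n";
}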
A surefire way to annoy me and lose points when we get sample code is people who for whatever reason use map in void context rather than a proper for loop (it doesn't say "LOOK I R IDIOMATIC CODERZ", it says "ITERATION: UR DOIN IT WRONG" (and yes, I do have a lolcat based scale for applicants :)).
I have done some research on best practices regarding map. At one point I wanted to ask why we don't just use map instead of foreach, but I found a few nodes on that subject where, as you just did, map in a null context was brought up, and I am not sure I understand what that means. Could you describe what map in a null context is?
Using map but not capturing the returned values:
map { something( $_ ) } @somelist;
In recent versions that's been optimized (basically the return values are silently discarded rather than a temporary list being built and then thrown away later), so it's not as blecherous performance-wise.
However, it really buys you nothing to use it instead of a for loop here, because you've now muddled the conceptual waters (was it using the returned values at one point and then changed? did they plan on possibly using them at some point?) and made the code harder to understand (rather than the important thing, what's being iterated over, being up front, you've got to read past the details of what's being done for each item to find out). It's along the lines of using passive or active voice in a sentence ("The cow jumped over the moon." vs. "The moon was jumped over by the cow"); using the wrong one can shift what the reader takes as the emphasis to the wrong part.
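For contrast, the same iteration written as a plain for loop (reusing the hypothetical something() and @somelist from above) puts the list being iterated over right up front:

for my $item (@somelist) {
    something( $item );
}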
I think I get it: why use a gun to kill a cockroach when you have a perfectly good shoe? Since you don't get the benefits of mapping the data, and are just wasting time and space, you're better off using for.
So, if I am understanding you correctly,
Using for has the following advantages:
my @result = map { f($_) ? g($_) : () } @list;

# or:

my @result;
for (@list) {
    push(@result, g($_)) if (f($_));
}
my @result = map { g($_) } grep { f($_) } @list;
But that's just me . . .
for $var (@list)
A related discussion (with more links and examples) is Map: The Basics in the Tutorials section.
I have not done much with Perl profiling, but when doing a performance comparison, don't you need to return the same results? It looks like test_for returns the number of elements in the post-push array, whereas test_map returns the new array, at least if I am reading push correctly.
Update: Note that I am not saying that the results will change much. In fact, here are mine for 60 seconds; test_for2 returns the array at the end of the test function.
       s/iter  for2   map   for
for2     2.70    --   -1%   -1%
map      2.68    1%    --   -0%
for      2.67    1%    0%    --

Benchmark: running for, for2, map for at least 60 CPU seconds...
       for: 61 wallclock secs (61.37 usr +  0.02 sys = 61.39 CPU) @  0.37/s (n=23)
      for2: 61 wallclock secs (61.37 usr +  0.03 sys = 61.40 CPU) @  0.37/s (n=23)
       map: 62 wallclock secs (61.48 usr +  0.03 sys = 61.51 CPU) @  0.37/s (n=23)
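test_for2 itself is not shown here; a minimal sketch of what it presumably looks like is simply test_for with an explicit return of the new list:

sub test_for2 {
    my @sqrt_results;
    for my $result (@results) {
        push @sqrt_results, sqrt($result);
    }
    return @sqrt_results;   # hand back the generated list, as test_map does
}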
Update 2: Would some kind monk be willing to comment on whether it is sufficient to just call the raw function (as in the OP), or whether you would also need to have the function return into a context of some sort? In other words, should there be another layer of function call here to force list context to make this a valid comparison?
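For example, would something along these lines (just a sketch, untested) be needed to force list context on each call?

cmpthese( $count, {
    for => 'my @r = test_for()',
    map => 'my @r = test_map()',
} );

For that to be a fair comparison the subs would presumably also have to return the new list, as test_for2 above does.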
--MidLifeXis
The speed increase comes from minimizing resource-intensive tasks such as sorting. Instead of running code every time you iterate over a loop, you perform a map on the data once, storing the output in an array, and then use the information from that array. The classic example is the Schwartzian transform.
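For the record, a bare-bones sketch of that idiom (expensive_key() and @data are placeholders, not code from this thread):

my @sorted =
    map  { $_->[0] }                      # 3. unwrap the original values
    sort { $a->[1] <=> $b->[1] }          # 2. sort on the precomputed keys
    map  { [ $_, expensive_key($_) ] }    # 1. compute each key exactly once
    @data;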
Best practices rarely have anything to do with performance. They are usually designed to avoid pitfalls or to increase readability and maintainability, often at the cost of performance.
I think your assumption that Damian recommended map for performance reasons is flawed, or did he say as much?