PerlMonks

map versus for

by dHarry (Scribe)

on Aug 04, 2008 at 17:40 EEST (#702071=perlquestion)
dHarry has asked for the wisdom of the Perl Monks concerning the following question:
       

Dear Monks

I was contemplating the 256 commandments from Damian’s "Perl Best Practices" when I encountered:

List Generation

Use map instead of for when generating new lists from old.

Being more used to other programming languages I often use the non-Perl approach to get the job done. I would typically use a for and probably not even consider the map. So this was an eye-opener for me. I decided to do a little test to see how much the difference is between the two.

(FYI: Perl v5.8.8 built for MSWin32-x86-multi-thread running on a Dell INSPIRON 9400)

I use the example as mentioned by Damian and the Benchmark module to test:

use strict;
use warnings;
use Benchmark qw(:all);

my @results;
my $count = -5;

# Populate list with 1 million numbers
for (my $i=0; $i<1_000_000; $i++) {
    push @results, $i;
}

cmpthese ( $count, {
    for => "test_for;",
    map => "test_map;",
} );

timethese($count, {
    for => "test_for;",
    map => "test_map;",
} );

sub test_for {
    my @sqrt_results;
    for my $result (@results) {
        push @sqrt_results , sqrt($result);
    }
}

sub test_map {
    my @sqrt_results = map { sqrt $_ } @results;
}

First the comparison:

$count=-1
(warning: too few iterations for a reliable count)
      Rate  for  map
for 2.67/s   -- -10%
map 2.98/s  12%   --

$count=-5
      Rate  map  for
map 3.05/s   -- -16%
for 3.61/s  18%   --

$count=-10
      Rate  for  map
for 2.73/s   --  -8%
map 2.95/s   8%   --

Hmmm, not really impressive this “gain” of using map over for?!

Next some timing:

$count=-1
Benchmark: running for, map for at least 1 CPU seconds...
       for:  1 wallclock secs ( 1.16 usr +  0.00 sys =  1.16 CPU) @  3.46/s (n=4)
       map:  2 wallclock secs ( 1.19 usr +  0.00 sys =  1.19 CPU) @  3.37/s (n=4)

$count=-5
Benchmark: running for, map for at least 5 CPU seconds...
       for:  6 wallclock secs ( 5.22 usr +  0.00 sys =  5.22 CPU) @  3.64/s (n=19)
       map:  6 wallclock secs ( 5.22 usr +  0.00 sys =  5.22 CPU) @  3.45/s (n=18)

$count=-10
Benchmark: running for, map for at least 10 CPU seconds...
       for: 10 wallclock secs (10.11 usr +  0.00 sys = 10.11 CPU) @  3.46/s (n=35)
       map: 10 wallclock secs (10.03 usr +  0.00 sys = 10.03 CPU) @  3.29/s (n=33)

Am I missing something? Is the example given by Damian a poor example? Should I really favor map over for when I want to generate a new list from another list?

Thanks upfront

Update

Besides the obvious advantages (less code, easier to understand), it is stated that map is normally considerably faster.

Re: map versus for [id://702075]
by philipbailey (Sexton) on Aug 04, 2008 at 17:52 EEST
       
    I don't have Perl Best Practices in front of me, but I suspect the point that Damian Conway is making is that "map" is more idiomatic, in Perl, than a "for" loop, and not necessarily that it is more performant.
Re: map versus for [id://702076]
by dreadpiratepeter (Curate) on Aug 04, 2008 at 17:53 EEST
       
    I don't have my best practices in front of me, but....
    I don't believe that the tip is for performance, but more for readability and clarity. As a general rule, map is better when doing a transformation, for is better when doing an iteration.
    Or more simply, use map when generating a result, use for otherwise.


    -pete
    "Worry is like a rocking chair. It gives you something to do, but it doesn't get you anywhere."
Re: map versus for [id://702079]
by dragonchild (Archbishop) on Aug 04, 2008 at 18:06 EEST
       
    map can be faster because it's theoretically parallelizable, unlike for which is (generally) not.

    The bigger point, though, is that by using map, you're telling me more about your intent with the code. map says "I'm doing something to each element, something that's probably easily described, and accumulating the result." On the other hand, for says "I'm doing something with each element and it could be anything."
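As a small illustration of that difference in intent (a sketch with made-up data, not taken from the book), the first statement builds a new list and the second loop only acts on each element:

```perl
use strict;
use warnings;

my @names = ('alice', 'bob', 'carol');   # hypothetical sample data

# Transformation: a new list is built from the old one, so map states the intent.
my @capitalized = map { ucfirst $_ } @names;

# Iteration: the loop is run for its side effect (printing), so for fits better.
for my $name (@capitalized) {
    print "Hello, $name\n";
}
```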


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?


      That's pretty much the distinction I'd make as well, although I usually phrase it another way: for is for generic iteration, map is specifically a transformation.

      A surefire way to annoy me and lose points when we get sample code is people who for whatever reason use map in void context rather than a proper for loop (it doesn't say "LOOK I R IDIOMATIC CODERZ", it says "ITERATION: UR DOIN IT WRONG" (and yes, I do have a lolcat based scale for applicants :)).

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.


        I have done some research on best practices regarding map. I wanted to ask at one point why we don't just use map instead of foreach. However I found a few nodes about that subject where, as you just did, map in a null context was brought up. However, I am not sure I understand what that means. Could you describe what map in a null context is?

        -Actualize

          Using map but not capturing the returned values:

          map { something( $_ ) } @somelist;
          

          In recent versions that's been optimized (basically the return values are silently discarded rather than a temporary list built and then discarded later) so it's not as blecherous performance wise.

          However it really buys you nothing to use it instead of a for loop here, because you've now muddled the conceptual waters (Was it at one point using the returned values and changed? Did they plan on possibly using them at some point?) and makes the code harder to understand (rather than the important thing (what's being iterated over) being up front, you've got to read past the details (what's being done for each item) to find out). It's along the lines of using passive or active voice in a sentence ("The cow jumped over the moon." vs "The moon was jumped over by the cow"); using the wrong one can shift what the reader takes as the emphasis to the wrong part.
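To make the comparison concrete, here is a minimal sketch. The body of something() is a hypothetical stand-in (it just records a doubled value), since the snippet above does not say what it does:

```perl
use strict;
use warnings;

my @somelist = (1, 2, 3);   # hypothetical sample data
my @seen;

# Hypothetical stand-in for something(): called only for its side effect.
sub something { push @seen, $_[0] * 2; return }

# map in void context: it works, but the list that map returns is never used.
map { something($_) } @somelist;

# The same work as a for loop: what is being iterated over is stated up front.
for my $item (@somelist) {
    something($item);
}
```

Both forms call something() once per element; only the second one tells the reader that no result list is wanted.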

          The cake is a lie.
          The cake is a lie.
          The cake is a lie.


            I think I get it: Why use a gun to kill a cockroach when you have a perfectly good shoe. Since you don't get the benefits of mapping the data, and as such you are wasting time and space, it's better that you use "for".

            So, if I am understanding you correctly,

            • for: used for iterating an action over a series of items.
            • map: used when one needs to create a table of data which will be used at a future time.

            -Actualize

Re: map versus for [id://702085]
by pc88mxer (Vicar) on Aug 04, 2008 at 18:21 EEST
       
    I often will use for instead of map for generating lists, especially when I am in the process of developing the code. Once I've got things figured out I might go back and re-code the loop as a map.

    Using for has the following advantages:

    • you have more control and options over loop execution (last, next, etc.)
    • you can use your own more descriptive named lexical instead of $_
    • it's more readable (especially for non-perl experts)
    If the list-generation logic is just a simple transformation, I'll just opt for a map implementation. Once, however, the logic becomes more complex, an explicit for loop begins to look more attractive. For instance, which of the following do you find easier to understand?
    my @result = map { f($_) ? g($_) : () } @list;
    
    # or:
    my @result;
    for (@list) {
      push(@result, g($_)) if (f($_));
    }
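The first two advantages listed above (loop control and a named lexical) can be sketched as follows. The data and the "negative value ends the input" rule are assumptions made up for the example:

```perl
use strict;
use warnings;

# Hypothetical readings; a negative value marks the end of valid input.
my @readings = (4, 7, 10, -1, 6);

my @doubled;
for my $reading (@readings) {
    last if $reading < 0;    # stop early; a map block has no clean way to do this
    next if $reading % 2;    # skip odd readings
    push @doubled, $reading * 2;
}

print "@doubled\n";          # prints "8 20"
```

With map, the skip could be written by returning an empty list, but stopping at the sentinel would need a separate step.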
    

      my @result = map { g($_) } grep { f($_) } @list;
      

      But that's just me . . .

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

      Oh definitely, because once you put more than one statement into a map, you have violated a bigger stylistic rule.
      In that case, I will either pull the logic into a subroutine and call it from the map (assuming that I may need to reuse it), or switch to a for loop.
      Actually, I will switch to a foreach loop. I find that always using foreach for the
      for $var (@list)
      form and for for the C-style form adds to the grokiness of my code.
      UPDATE: should have looked closer at the body of the for there, I agree with Fletch on that one


      -pete
      "Worry is like a rocking chair. It gives you something to do, but it doesn't get you anywhere."
Re: map versus for [id://702096]
by toolic (Curate) on Aug 04, 2008 at 18:50 EEST
       
    I do have PBP in front of me, and the focus is definitely on style, rather than performance. Perhaps there are some applications for which map is significantly faster than for, but, judging by your experiment, the code example in the book seems not to be one of them.

    A related discussion (with more links and examples) is Map: The Basics in the Tutorials section.

Re: map versus for [id://702123]
by MidLifeXis (Chaplain) on Aug 04, 2008 at 20:06 EEST
       

    I have not done much with perl profiling, but when doing a performance comparison, don't you need to return the same results? It looks like the test_for returns the number of elements in the post-push array, whereas test_map returns the new array, at least if I am reading push correctly.

    Update: Note that I am not saying that the results will change much. In fact, here are mine for 60 seconds. test_for2 does a return of the array at the end of the test function.

         s/iter for2  map  for
    for2   2.70   --  -1%  -1%
    map    2.68   1%   --  -0%
    for    2.67   1%   0%   --
    Benchmark: running for, for2, map for at least 60 CPU seconds...
           for: 61 wallclock secs (61.37 usr +  0.02 sys = 61.39 CPU) @  0.37/s (n=23)
          for2: 61 wallclock secs (61.37 usr +  0.03 sys = 61.40 CPU) @  0.37/s (n=23)
           map: 62 wallclock secs (61.48 usr +  0.03 sys = 61.51 CPU) @  0.37/s (n=23)
    

    Update 2: Would some kind monk be willing to comment on whether it is sufficient to just define the raw function (as in the OP), or whether the function also needs to return into a context of some sort? In other words, should there be another layer of function call here to force list context to make this a valid comparison?
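One possible way to set up such a comparison is sketched below. It is not a definitive answer to the question: the names sqrt_for, sqrt_map and @input are made up for the example, the list is much smaller than in the original test so that it runs quickly, and code references are passed to cmpthese instead of strings.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @input = (0 .. 9_999);   # assumption: far fewer elements than the original test

sub sqrt_for {
    my @out;
    push @out, sqrt($_) for @input;
    return @out;            # explicitly return the new list
}

sub sqrt_map {
    return map { sqrt $_ } @input;
}

# Assigning to an array forces list context on both calls,
# so both candidates build and hand back the same list.
cmpthese(-1, {
    for => sub { my @r = sqrt_for() },
    map => sub { my @r = sqrt_map() },
});
```

Whatever numbers this prints will depend on the Perl version and the machine, so they should be read as a comparison on one system only.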

    --MidLifeXis

Re: map versus for [id://702124]
by actualize (Beadle) on Aug 04, 2008 at 20:13 EEST
       

    The speed increase involves minimizing resource-intensive tasks such as sorting. Instead of running code every time you iterate over a loop, you perform a map on the data once, storing the output in an array. Then you can use the information from the array. The classic example would be the Schwartzian transform.
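A minimal sketch of the Schwartzian transform follows. The word list is made up, and length() stands in for a sort key that would be expensive to compute in a real case:

```perl
use strict;
use warnings;

my @words = qw(pear fig banana kiwi);   # hypothetical sample data

# Compute the key once per element, sort on the cached key, then unwrap.
# Read the pipeline from the bottom up.
my @by_length =
    map  { $_->[1] }                     # 3. keep only the original value
    sort { $a->[0] <=> $b->[0]           # 2. compare the cached keys,
           or $a->[1] cmp $b->[1] }      #    breaking ties alphabetically
    map  { [ length($_), $_ ] }          # 1. pair each value with its key
    @words;

print "@by_length\n";                    # prints "fig kiwi pear banana"
```

The key function runs once per element instead of once per comparison, which is where the saving comes from when the key is costly.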

    -Actualize
Re: map versus for [id://702157]
by ikegami (Archbishop) on Aug 04, 2008 at 22:32 EEST
       

    Best practices rarely have anything to do with performance. They are usually designed to avoid pitfalls or to increase readability and maintainability, often at the cost of performance.

    I think your assumption that Damian recommended map for performance reasons is flawed, or did he say as much?

