Archive for the ‘Essential PowerShell’ Category

Effective PowerShell

September 9, 2007

Keith Hill has started an “Effective PowerShell” series on his blog that lets you learn from his experience working with PowerShell over the past couple of years.  He has published 6 articles in this series so far:

  1. Effective PowerShell Item 1: The Four Cmdlets That are the Keys to Finding Your Way Around PowerShell
  2. Effective PowerShell Item 2: Use the Objects Luke. Use the Objects!
  3. Effective PowerShell Item 3: Know Your Output Formatters
  4. Effective PowerShell Item 4: Commenting Out Lines in a Script File
  5. Effective PowerShell Item 5: Use Set-PSDebug -Strict In Your Scripts – Religiously
  6. Effective PowerShell Item 6: Know What Objects Are Flowing Down the Pipe

These articles are very similar to the articles in my Essential PowerShell series.  There is a lot of useful information in these articles.  I encourage you to take the time and read them.

Kirk out.

Essential PowerShell: Avoid shorthand in shared PowerShell scripts

September 6, 2007

While this topic may be the subject of debate and it has certainly been discussed to some extent before, I am a firm believer that PowerShell script authors should completely avoid shorthand in their shared PowerShell scripts.  Shorthand should only be used when using PowerShell interactively from the console.  I consider this a best practice when working with PowerShell.  There are four reasons for this that all boil down to providing a better user experience: readability, integrated help support, portability and upgradability.  I’ll explain.

Readability

PowerShell scripts written without any shorthand are as close to self-documenting as you can get.  If the cmdlets, parameters and variables being used are intelligently named, you should be able to read a script that uses cmdlets or parameters you haven’t used before and still have a general idea of what the script is going to do.  PowerShell’s verb-noun format for cmdlets goes a long way to facilitate that.  Readable scripts are also easier to maintain, because a script maintained by multiple authors must be understandable to all of them.

Integrated Help Support 

PowerShell is still new and will likely be new for a while yet.  Many people new to PowerShell will use sample scripts to learn the language, and it will be much easier for them to learn from sample scripts if they use full cmdlet and parameter names.  That way they can take full advantage of the rich help system that is integrated within PowerShell by using it to learn more about PowerShell scripts they find on the web or in products that expose integrated PowerShell scripts such as PowerGUI.  To illustrate this point, let’s look at a sample script where using shorthand can harm the user experience of other PowerShell users.

Here’s a sample script using shorthand that will retrieve the expanded parameter sets for the get-command cmdlet:

gcm -name get-command -type cmdlet | select -expand parametersets

And here’s that same sample script without any shorthand:

Get-Command -name Get-Command -commandType Cmdlet | Select-Object -expandProperty ParameterSets

If someone new to PowerShell is trying to figure out what the different parts of this script do, they can use the get-help cmdlet or help alias on the parts of the script.  This can be especially important if English isn’t their first language (the integrated help documentation in PowerShell has already been localized in many different languages to reduce the learning curve for people whose first language is not English).  For the sample script, they might try to look up help for the gcm command, the name parameter, the type parameter, the select command, or the expand parameter.  Here are the PowerShell commands to do just that:

  1. Get-Help gcm
  2. Get-Help gcm -parameter name
  3. Get-Help gcm -parameter type
  4. Get-Help select
  5. Get-Help select -parameter expand

Of these 5 commands, 2 will fail.

The third command fails because the Get-Command cmdlet does not have a Type parameter.  Type is the alias for the CommandType parameter, and PowerShell 1.0 does not resolve alias names when passed in as the value of the Parameter parameter.

The fifth command fails because the Select-Object cmdlet does not have an expand parameter.  But expand isn’t an alias for a parameter either.  In PowerShell 1.0, part of the parameter name resolution logic includes support for identifying a parameter by any leading substring (prefix) of its name, or of one of its aliases, that is unique among the parameters and aliases of the cmdlet.  In this case, expand is a prefix of expandProperty and there are no other parameters or aliases beginning with “expand”, so PowerShell deduces that the script author is referring to expandProperty and lets the script run accordingly without warnings or errors.  The help lookup gets no such deduction for the value passed to the Parameter parameter, which is why it fails.

Had the script been written without any shorthand, as in the second sample, then all attempts to look up the same help information would succeed.  Here are the same Get-Help commands but without any shorthand:

  1. Get-Help Get-Command
  2. Get-Help Get-Command -parameter name
  3. Get-Help Get-Command -parameter commandType
  4. Get-Help Select-Object
  5. Get-Help Select-Object -parameter expandProperty

 All 5 of these commands work as expected.
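If you meet shorthand in someone else’s script, one way to bridge the gap is to resolve it to full names before asking for help.  The following is only a minimal sketch.  Get-Alias is available in PowerShell 1.0, but the Parameters property used in the second half is an assumption that holds only in later versions of PowerShell, and the variable names are mine.

```powershell
# Resolve command aliases to the cmdlets they point to.
$commandName = (Get-Alias -Name gcm).Definition      # the cmdlet behind gcm
$selectName  = (Get-Alias -Name select).Definition   # the cmdlet behind select

# Find the parameter that owns the "Type" alias.
# Assumption: the Parameters property requires PowerShell 2.0 or later.
$parameters = (Get-Command -Name $commandName).Parameters
$typeOwner = $parameters.Values |
    Where-Object { $_.Aliases -contains 'Type' } |
    ForEach-Object { $_.Name }

# $commandName and $typeOwner can now be passed to Get-Help, for example:
#   Get-Help $commandName -Parameter $typeOwner
```

This only helps with aliases.  A truncated parameter name such as expand still has to be matched against the full parameter list by eye.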

Portability

As Jeffrey Snover indicated in his blog post titled “Is it safe to use ALIASES in scripts?”, aliases are not constant and can be removed.  While this would likely happen only rarely in practice, PowerShell scripts that use aliases are not guaranteed to be portable to other environments, so aliases should be avoided in shared scripts.
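To make the portability concern concrete, here is a small sketch showing that an alias is just session state that can disappear.  The alias name gdnow is hypothetical and chosen only for illustration.  I use a custom alias rather than a built-in one so the example does not disturb your session.

```powershell
# Define a session-local alias (hypothetical name) for Get-Date.
Set-Alias -Name gdnow -Value Get-Date
$before = Get-Command -Name gdnow -ErrorAction SilentlyContinue

# Any user, profile script, or host application can remove it again.
Remove-Item -Path Alias:gdnow
$after = Get-Command -Name gdnow -ErrorAction SilentlyContinue

# $before holds the alias; $after is empty, so a script that
# calls gdnow would fail in this session from here on.
```

A script that calls Get-Date directly is unaffected by any of this.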

Upgradability

As I mentioned earlier in this post, part of the parameter name resolution logic in PowerShell 1.0 includes support for identifying parameters by the shortest substring that uniquely identifies the parameter or an alias to the parameter when compared with a list of parameters and aliases for the cmdlet.  This means we could have written the above sample script like this:

 gcm -na get-command -ty cmdlet | select -exp parametersets

In this case, “na” is the shortest substring that uniquely identifies the name parameter, “ty” is the shortest substring that uniquely identifies the type alias for the commandType parameter, and “exp” is the shortest substring that uniquely identifies the expandProperty parameter.  While this works fine just now, you cannot depend on this sample continuing to work in future releases of PowerShell.  Why?  Because there is no guarantee that another parameter or alias will not be added in a future release that would make one of these substrings ambiguous.  In fact, you cannot depend on this sample working in the current release of PowerShell for users who have added parameter aliases that would make one of these substrings ambiguous either.
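The same risk can be reproduced on a small scale with a function of your own.  This is a sketch under stated assumptions: Get-Greeting and its parameters are hypothetical, and the CmdletBinding attribute and try/catch require PowerShell 2.0 or later.

```powershell
# Version 1 of a hypothetical function: "-Na" uniquely identifies -Name.
function Get-Greeting {
    [CmdletBinding()]
    param($Name)
    "Hello, $Name"
}
$first = Get-Greeting -Na 'Kirk'

# Version 2 adds a -Namespace parameter, so "-Na" now matches two parameters.
function Get-Greeting {
    [CmdletBinding()]
    param($Name, $Namespace)
    "Hello, $Name"
}

# The caller did not change, but the call no longer binds.
$failed = $false
try { Get-Greeting -Na 'Kirk' } catch { $failed = $true }
```

The first call succeeds and the second raises an ambiguous-parameter error, even though the calling line is identical.  Spelling out -Name works against both versions.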

For these four reasons, PowerShell authors should be diligent about avoiding use of shorthand in shared PowerShell scripts.  While using aliases for cmdlets, parameters and functions and shorthand to identify parameters is very useful when using PowerShell interactively, it can negatively impact the user experience of others when used in PowerShell scripts and therefore should be avoided (with few exceptions, if any).

Kirk out.

Essential PowerShell: Understanding foreach (Addendum)

August 31, 2007

I need to add an important addendum to the Essential PowerShell: Understanding foreach article I posted (if you haven’t read it already, you might find it worthwhile to start with that article first before reading this article).

As Dmitry Sotnikov discovered first-hand and blogged about earlier today, there is another important difference between what might appear as two similar constructs in PowerShell: the foreach statement and the foreach alias (which is an alias to the ForEach-Object cmdlet).  The foreach statement is a loop construct.  The foreach alias (and therefore the ForEach-Object cmdlet) is not a loop construct.  It’s a cmdlet.  Why is this important?  Because you can control the logic within a loop construct using the break and continue statements, but not within a cmdlet.

Let’s examine this in more detail using a new example.  This time I’ll write a PowerShell script that will take in a string and output each character of that string that is not a vowel.  Note that this is just an example and it is not necessarily the best way to do this sort of thing.  I’m just using it to get the point across.

Here is the foreach statement example:

foreach ($character in [char[]]"Poshoholic") { if (@('a','e','i','o','u') -contains $character ) { continue } $character }

And now what appears to be the equivalent foreach alias example:

[char[]]"Poshoholic" | foreach { if (@('a','e','i','o','u') -contains $_ ) { continue } $_ }

If you run each of these two examples, you’ll quickly see that they don’t function the same way at all.

In the foreach statement example script you get what you would expect: each consonant in the string “Poshoholic” is output to the host, so you’ll see P, s, h, h, l and c, each on a separate line.  In the foreach alias example script, though, only the consonant characters that come before the first vowel in the string are output to the host.  In this case, all that is output is P.  Why?  Because the continue statement (and the break statement) only affects the logic of the program loop in which it is contained.  The second example doesn’t actually contain a program loop (remember that the foreach alias, and therefore the ForEach-Object cmdlet, is not a loop construct), so the continue statement has no loop to act on.  Instead of skipping one character, it applies to the entire script in which it is contained and stops the pipeline when the first vowel is reached.

You could correct this by removing the continue statement and changing the script as follows:

[char[]]"Poshoholic" | foreach { if ((@('a','e','i','o','u') -contains $_) -eq $false ) {$_} }
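Another option, if you prefer to keep the “skip this one” shape of the original script, is the return statement.  Inside the script block passed to ForEach-Object, return ends processing of the current object only, and the pipeline then moves on to the next one.  A minimal sketch, with a variable name of my own choosing:

```powershell
$consonants = [char[]]"Poshoholic" | ForEach-Object {
    # return leaves the script block for this one object only,
    # so it behaves the way continue does inside a real loop.
    if (@('a','e','i','o','u') -contains $_) { return }
    $_
}

# $consonants now holds P, s, h, h, l and c.
$consonants
```

Note that return is not a substitute for break here: there is no equally simple way to stop the whole pipeline early from inside the script block.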

This can be confusing, so if you have any questions feel free to post them in the comments and I’ll respond as quickly as I can.  And with a little luck, Microsoft will use this information to clarify their documentation in the about_foreach help topic in a future release of PowerShell.

Kirk out.

Essential PowerShell: Understanding foreach

August 21, 2007

Of all of the statements and commands available in PowerShell, there is one in particular that I have found causes more confusion than others for newcomers to the language: foreach.  In PowerShell, foreach is both a statement and an alias to the ForEach-Object cmdlet.  This means that you can use it as a statement like this:

foreach ($command in Get-Command -CommandType All) { $command } 

 or as an alias like this:

Get-Command -CommandType All | foreach { $_ }

While you might think that both of these examples do exactly the same thing, they do not.  Both examples will iterate through a collection of objects and execute the internal script block once for each object.  In this case, both examples are simply outputting the objects in the internal script.  Their output will be the same, but how they go about getting that output is different.  It is important to understand these differences and the implications that they have on performance and memory when writing scripts using PowerShell.  Let’s talk about the memory implications first.

The foreach statement does not use pipelining.  Instead, the right-hand side of the in operator is evaluated to completion before anything else is done.  For our example above, the Get-Command cmdlet is called and the results are completely loaded into memory before the interior script block is executed.  This means you have to have enough memory to store all of the objects when you run the script.  This usually isn’t a problem but as my friend Dmitry Sotnikov points out on his blog, in some cases it can definitely be an issue.

In contrast, the foreach alias, or ForEach-Object cmdlet, does use pipelining.  When the second example is used, Get-Command is called and it starts to return the commands one at a time.  As each object is returned out of the Get-Command cmdlet, it is sent into the pipeline and execution continues in the next section of the pipeline.  In this case, the foreach alias gets executed and the object is run through the process script block of ForEach-Object.  Once the process script block completes, the object is discarded and the next object is returned from Get-Command.  Since only one object is passing through the pipeline at a time, memory usage is minimal.
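The difference in evaluation order can be made visible with a small experiment.  This is only a sketch: Get-Number is a hypothetical function standing in for Get-Command, and the log is there purely to record the order in which things happen.

```powershell
$log = New-Object System.Collections.ArrayList

# Hypothetical producer: records each object just before emitting it.
function Get-Number {
    foreach ($n in 1..3) {
        [void]$log.Add("produce $n")
        $n
    }
}

# foreach statement: the producer runs to completion first.
foreach ($n in Get-Number) { [void]$log.Add("consume $n") }
$statementOrder = @($log) -join ', '
# produce 1, produce 2, produce 3, consume 1, consume 2, consume 3

$log.Clear()

# ForEach-Object: each object is consumed as soon as it is produced.
Get-Number | ForEach-Object { [void]$log.Add("consume $_") }
$pipelineOrder = @($log) -join ', '
# produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
```

The -join operator assumes PowerShell 2.0 or later.  The interleaved order in the second case is what keeps memory usage low, since no object has to wait for the ones after it.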

This would seem to indicate that script authors should always prefer the foreach alias, or ForEach-Object cmdlet, over the foreach statement, but according to Bruce Payette, author of PowerShell in Action and development lead for PowerShell, foreach can perform faster than ForEach-Object in some cases.  He states, “in the bulk-read case, however, there are some optimizations that the foreach statement does that allow it to perform significantly faster than the ForEach-Object cmdlet”.  If that’s the case, how does a script author decide which is the right command for the job?  How will those optimizations influence a decision to choose the foreach statement over the foreach alias?  How much faster is significantly faster?  Let’s take a closer look at the performance and what considerations need to be made.

I ran a test on my local machine to compare the performance of the two examples I listed above.  For this test I used the Get-Date cmdlet to retrieve the date before the example script started and after the example script completed and then I took the difference of these dates to determine how much time had elapsed during the script.  I also ran this test through 10 iterations for each example and I discarded the highest and lowest elapsed times.  I then took the averages of the remaining 8 iterations and compared them.  The results confirmed what Bruce Payette said.  The average runtime for the foreach statement example was 13.9 seconds and the average runtime for the foreach alias example was 15.9 seconds.  This shows how the internal optimizations in the foreach statement improve performance when compared to the foreach alias.
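For anyone who wants to repeat the measurement, the procedure described above could be sketched roughly as follows.  This is a reconstruction from the description, not the exact script I ran.  The function name Get-TrimmedAverageSeconds and its parameters are hypothetical, and it assumes at least 3 iterations so that something is left after the highest and lowest times are discarded.

```powershell
function Get-TrimmedAverageSeconds {
    param([scriptblock]$ScriptBlock, [int]$Iterations = 10)

    $times = @()
    foreach ($i in 1..$Iterations) {
        # Take the date before and after, and keep the difference in seconds.
        $start = Get-Date
        & $ScriptBlock | Out-Null
        $times += ((Get-Date) - $start).TotalSeconds
    }

    # Discard the lowest and highest elapsed times, then average the rest.
    $sorted = @($times | Sort-Object)
    $kept = $sorted[1..($sorted.Count - 2)]
    ($kept | Measure-Object -Average).Average
}
```

Calling it once with { foreach ($command in Get-Command -CommandType All) { $command } } and once with { Get-Command -CommandType All | foreach { $_ } } gives the two averages to compare.  Expect different absolute numbers on your own machine.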

So, it seems pretty simple.  Use the foreach statement when you either already have the array of objects that you want to process or when your collection of objects will be small enough that it can be loaded into memory all at once, right?  Well that depends on what aspect of performance is most important to you.

One of the many beautiful things about PowerShell is the support for the pipeline and how objects are passed through (and out of) the pipeline one at a time.  If you’re working with an application that displays the data objects that are output through the pipeline of a script, such as PowerGUI, you may be more concerned with the rate at which those data objects are output through the pipeline, so that you can display them more quickly, than with the overall amount of time required to output all objects.  Whatever portion of the 13.9 seconds was used to load the objects into the collection may seem like an eternity to wait until the first object is displayed on the screen when you could instead see thousands of objects displayed iteratively over 15.9 seconds.  Perspective is everything when you’re talking about performance.

Hopefully this will lift some of the confusion that you might otherwise face when using foreach in your scripts!

Kirk out.

P.S. This is the first of two articles discussing foreach in PowerShell.  After reading this article I recommend you read the second part as well, entitled “Essential PowerShell: Understanding foreach (Addendum)”.
