Category Archives: Computing

Building a Python portfolio

Recently I’ve struggled to find the time and energy to write blog posts, for various reasons, and I cannot deny that Pokémon Go might’ve been one of them. However, there were some valid reasons too. I realised that while I might do some automation, write some Python tools and generally dabble in programming, none of that is really visible from my LinkedIn profile or on my CV.

Sure, I can say that I have experience with Python 2.7.X and Python 3.4/3.5, but where’s the evidence? I can’t exactly put my code from my work repositories online, because that code belongs quite rightly to the company I work for.

 

Thus, if I ever want to make it clear that I can write programs and that I am familiar with Python, I need to put up a GitHub repo with some examples of the kinds of things I am capable of doing in Python. Great idea so far, and I quickly spun up a repo and initialised it and… Then what?

 

I started asking around in the Testers Slack (http://www.testers.io/; if you’re a tester and you’re not on that Slack, you’re really missing out on a wealth of resources and advice) about what people would look for in a portfolio. Did I want to build a gigantic project with lots of functionality to show what I could do, or did I want to build lots of smaller tools? Luckily the consensus on Slack was that the smaller tools route was the right way to go. I love writing little tools, so that worked out well for me.

Now I had a plan, a rough idea of what I needed to do, so I sat down and I started to do it.

 

Or I tried to.

 

Turns out that I had no ideas about what I could write. Normally when I write Python it’s to solve an immediate problem. For example, I started my job six months ago and I’ve filled a small repo with ten or so small scripts that each solve an issue I’ve been facing in my work. One spins up a suite of VMs for regression checking, one resets variables and enables dev options in builds, and one is a simple HTTPS server that swaps out a served file on command. Essentially, very small, self-contained scripts that make my day-to-day life easier.

 

So, once again I turned to the community to get some ideas and maybe someone to help me. After all, I am not the best programmer, I would appreciate some input from someone who’s more experienced, and nothing says “I can work in a team” like a collaborative project. For this I turned to Robert Page, a fellow tester from the Testers Slack (spotting a pattern yet?), who’s kindly agreed to be the person I bounce my ideas off and, hopefully, to throw a few pull requests my way.


Caution: Python technical part.
But what will we actually be making? Well, nothing groundbreaking, but I liked the idea of making a dead link scanner. Not a new idea, but it seems like a fun thing to write. My idea was at first to just use scrapy and pull all the URLs in a domain, then run those URLs through requests and see if I get a 404 from requests.get(url).status_code. If I did, I would append the URL to a list and display it at the end of the run.
Robert convinced me that that wasn’t enough, and he was right. What about pages that return 200, but display a set “We couldn’t find your page” message? Or when you hit a 403? Basically, scanning for simple 404s isn’t enough.
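To make Robert’s point concrete, here is a minimal sketch of the check, not the code in the repo. It assumes the requests library is installed; the function names, the marker phrases and the result labels are all hypothetical and only there for illustration.

```python
# Hypothetical phrases that mark a "soft 404": a page served with status 200
# that really means "not found". A real scanner would need a per-site list.
SOFT_404_MARKERS = ("we couldn't find your page", "page not found")


def classify_response(status_code, body):
    """Label a response using its status code and body text."""
    if status_code == 404:
        return "dead"
    if status_code >= 400:
        return "error"  # covers 403, 500 and friends
    if any(marker in body.lower() for marker in SOFT_404_MARKERS):
        return "soft-404"
    return "ok"


def check_url(url, timeout=10):
    """Fetch a URL and classify the result."""
    # Imported here so classify_response can be used without requests installed.
    import requests

    try:
        response = requests.get(url, timeout=timeout)
    except requests.RequestException:
        return "unreachable"  # DNS failure, timeout, refused connection, ...
    return classify_response(response.status_code, response.text)
```

Keeping the classification separate from the fetching means the rules can be tested without touching the network, and new rules can be added as more kinds of “dead” turn up.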

 

But it’s definitely my minimum viable product, so I plan to start with the link scanner taking a URL and a recursion depth argument. As I commit and push this, I’ll try to make my comments and commit messages explanatory enough to support this narrative. The GitHub repo is at https://github.com/Dominic-Kua/Random-things so please have a look and enjoy watching me learn.

The far side of localisation

I’m sure many of you have to deal with localisation of strings and inputs from users. You probably outsource the actual translation to a translation house and you get back a list of strings you can’t really read, but that’s why you hire a company to make your website say “你好!” instead of “Hello!”. Don’t worry, however: I am not going to write a post on the wonders of testing in a language you don’t know, mainly because I’ve never done it. The subject is almost certainly immortalised in blog posts already, and it’s a worthy subject for them.

My experience lies on the far side of that localisation. You might be set up to receive the input of a string in a non-Latin script, but how exactly does the user input that string?
My experience is mostly with Mandarin Chinese, so my examples will largely be in that language as opposed to Japanese, Korean or other types of Chinese. If I forget to mention which IME I am using, you can probably assume it’s Microsoft’s Chinese Simplified PRC.

So first of all, why am I sharing this information? Simply put, I think that this is something many people will be unfamiliar with, and that it might help you to understand your non-Latin userbase better.

So, what does an IME look like?

[Image: IME-basic]
Something like this. Here I’ve typed in the phrase “hanzi”, meaning “Chinese characters” in Chinese. As you can see, there’s a range of options, because Chinese is a language of homophones differentiated (sometimes) by tone*. Thus for any arrangement of valid syllables, there can be many valid conversions into Chinese characters, and the user is presented with a drop-down menu showing all the choices.
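If it helps to think of it in code, the candidate list is roughly a one-to-many lookup. This is only a toy sketch with a tiny hand-written table; the names are hypothetical, and a real IME uses a large dictionary and ranks candidates by frequency and context.

```python
# Hypothetical candidate table keyed by toneless pinyin.
CANDIDATES = {
    "hanzi": ["汉字", "汉子", "汗渍"],  # Chinese characters / fellow / sweat stain
    "ma": ["吗", "妈", "马", "骂", "麻"],
}


def lookup_candidates(pinyin):
    """Return the possible character conversions for a pinyin string."""
    return CANDIDATES.get(pinyin.strip().lower(), [])


def choose(pinyin, index):
    """Simulate the user picking entry number `index` (1-based) from the menu."""
    candidates = lookup_candidates(pinyin)
    if not 1 <= index <= len(candidates):
        raise ValueError(f"no candidate {index} for {pinyin!r}")
    return candidates[index - 1]
```

The point for a tester is that what the user types (“hanzi”) and what your application finally receives (one of several strings) are two different things, with a selection step in between.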

How does this affect you? Perhaps your input options include hints. How do these hints interact with the IME? Is the layout of your page conducive to this kind of input? If you test a desktop application, does it try to steal input from the OS? A good example of this would be a game with a chat channel: would you steal input while the chat channel isn’t focussed? Do you have a way to ensure that you don’t steal input if someone enables or disables the IME inside the game?

You can try this out yourself by installing the Chinese keyboard input pack on Windows or SCIM on Linux. I am sure there’s a version for the Mac, but I’m not familiar with the platform. If you decide to really see what the issues are around IMEs, you’ll probably want to try out a few different languages, namely the so-called CJK languages (Chinese, Japanese and Korean).

However, for me these languages present a slightly different challenge. Due to the nature of the software I test, we can’t devolve language handling to the OS completely; instead we have to implement some areas of it ourselves. Thus while I can read and write a little Chinese and much less Japanese, I’ve recently had to pick up enough Korean to test the Korean IME. To this end I actually resorted to the Microsoft documentation on the Korean IME*, which proved that while I could type “hangul”, it wasn’t the characters for Hangul that were being shown.

I’m not suggesting that everyone who wants to test using an IME learn the languages of the IME, because that would take far too much time and not everyone enjoys learning languages. However, knowing a few words of each and, importantly, how to input them on your supported platforms could be extremely useful.

* https://support.microsoft.com/en-us/kb/130053

Hypothesis testing and why we’re all actually research scientists

Testers are scientists.

You might not wear a lab coat to work (but if you don’t, you should try it) and you might not think of yourself as a scientist, but what you do every day is craft a hypothesis and then design and execute experiments to try to falsify it. Many testers might not think of themselves as experimental scientists. There’s nary a test tube in many an office and no Bunsen burners in the server room, so how can we be scientists?

Science isn’t defined by the equipment, but by the methodology. Karl Popper’s philosophy of science* defines science, in a horrible paraphrase, as the framing of a hypothesis and repeated attempts to disprove it – which is exactly what you do as a tester. You have a hypothesis, which I’ll designate as the hypothesis under test. The hypothesis under test for software testers is, generally, that the software behaves as expected.

So why does this matter? Well, first you need to define what the expected behaviour is: what the software should do when presented with different inputs and situations. Then you need to see if you can falsify the hypothesis under test by designing experiments which provide a series of inputs and situations and observe the outcome. You tend to call these test cases. But are you really doing this? It’s all too easy to fall into the trap of merely checking that the software behaves properly when given correct input; however, that’s not falsification. You definitely should check that, but there has to be an element of falsification: you have to test the claim that when given abnormal input or an abnormal situation it also behaves correctly. Of course the label “situation” here covers a multitude of sins, from a throttled CPU, to network timeouts, to limited RAM. Your experiment will probably be best served by being broken down into many experiments to isolate each input and environmental variable.
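As a small illustration of falsification in practice, here is a sketch in Python. Everything in it is hypothetical: parse_port stands in for whatever you are testing, and the hypothesis under test is “abnormal input is always rejected with a ValueError”.

```python
def parse_port(text):
    """Hypothetical function under test: parse a TCP port from user input."""
    value = int(text.strip())
    if not 1 <= value <= 65535:
        raise ValueError(f"port out of range: {value}")
    return value


def try_to_falsify(func, abnormal_inputs):
    """Run each experiment and return the inputs that falsify the hypothesis."""
    falsifications = []
    for candidate in abnormal_inputs:
        try:
            func(candidate)
        except ValueError:
            continue  # rejected cleanly, the hypothesis survives this experiment
        except Exception as exc:
            falsifications.append((candidate, repr(exc)))  # wrong kind of failure
        else:
            falsifications.append((candidate, "accepted"))
    return falsifications


# One experiment per abnormal input, including full-width digits and None.
ABNORMAL_INPUTS = ["", "  ", "0", "65536", "-1", "80.5", "eighty", "８０", None]
```

Running try_to_falsify(parse_port, ABNORMAL_INPUTS) reports two survivors: the full-width “８０” is accepted, because int() understands Unicode digits, and None fails with an AttributeError rather than a ValueError. Whether either is a bug depends on the expected behaviour you defined up front, which is exactly why that step comes first.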

Your experiment is starting to look a lot like a test plan and test cases/areas under test. Now you might be testing something like tens of thousands of connections to a database, or packet transfer, where the environment itself will introduce occasional failures, or you might simply have a tolerance for a certain level of failure. You accept that as long as X% or fewer of your experiments/tests fail, the system as a whole works. This is a statistical method of hypothesis testing; you might have done something like this at school with confidence intervals, two-tailed tests and the like. There’s a lot of maths around this area and if you’re so inclined (and I honestly think it’s worthwhile knowing) you can find lots about it on the internet**. However, it’s probably enough to get an overview of the basic maths from the links below, and you’ll see how you can account for things like systematic error.
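For a flavour of the maths, here is a minimal sketch of a one-tailed binomial test. It assumes each test run fails independently with the same probability, which is a simplification, and it needs Python 3.8 or later for math.comb. The function names and the 5% significance level are illustrative choices, not a recommendation.

```python
from math import comb


def binomial_tail(trials, failures, tolerated_rate):
    """P(X >= failures) when each trial fails independently at tolerated_rate."""
    return sum(
        comb(trials, k) * tolerated_rate**k * (1 - tolerated_rate) ** (trials - k)
        for k in range(failures, trials + 1)
    )


def exceeds_tolerance(trials, failures, tolerated_rate, significance=0.05):
    """True if this many failures would be surprising at the tolerated rate."""
    return binomial_tail(trials, failures, tolerated_rate) < significance
```

With a tolerated failure rate of 1%, ten failures in a thousand runs is about what you would expect, so exceeds_tolerance(1000, 10, 0.01) is False, while thirty failures in a thousand runs is far out in the tail and exceeds_tolerance(1000, 30, 0.01) is True.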

I hope this has illustrated something of how I approach my testing. I was trained at university as a physicist, so I spent a lot of time designing ways to test hypotheses and even more time actually testing them. Then I spent even more time than that going through my results to extract meaning from the morass of data I’d generated.

I won’t be plunging this blog into a morass of mathematics and statistics, at least on my regular testing blog posts. Though I have an idea to explain randomness and its application to algorithms to generate efficient computations and good test coverage. This would require a whole series of posts to explain what randomness is, how it can be useful and why you shouldn’t fear things slipping through a procedurally generated test regime which employs stochastic elements in its tests. Whether anyone would want to read it though is another matter, if you feel strongly either way please let me know in the comments.

* https://en.wikipedia.org/wiki/Karl_Popper#Philosophy_of_science
** http://www.ats.ucla.edu/stat/mult_pkg/faq/general/tail_tests.htm
https://en.wikipedia.org/wiki/Confidence_interval

Getting started in Python

Recently I’ve had an enquiry or two about resources for learning Python and applying it to testing. Luckily for people who want to know, I was not born knowing Python, and I’ve amassed quite a collection of resources relating to Python and a few ideas of my own that I hope can help.

First up, there are the basic courses on Python.
Codecademy – £Free
This is a basic, and free, grounding in Python and programming that takes you through several projects and quizzes to get you started. Definitely recommended for the more traditional starter who wants to learn about programming more than instant practical applications. It can easily be done in a couple of work days, a couple of weeks of evenings, or a weekend. It’s very accessible and, aside from the final project, you can do it all in the browser, so it’s perfect for those who want to dabble.

Automate the Boring Stuff with Python – £31
Another course for absolute beginners, longer than Codecademy’s offering and with a lot more practical examples, but less structured towards general programming concepts and more towards getting things done. Yes it costs money, but you’ll be productive with it right off the bat.

Google Python lessons – £Free
How I learnt Python. This assumes that you know at least one other programming language and the basics of programming; it’s teaching you Python and not how to write a program. As such it doesn’t go through a lot of things like basic program flow, algorithm design or the like. It teaches you how Python does things and gives a great basic understanding of the Python standard library. It doesn’t touch much on third-party libraries.

EDIT: Hat tip to Sorina for this Computer Science 101 from Udemy. As she suggests, it’s an excellent companion to the Google course as it fills in the gaps in that course.

Next up there’s podcasts:
Talk Python to Me
Talk Python to Me is a general Python podcast. Its level is definitely beyond beginner, so I’d recommend doing some Python programming first to see if you like it. It’s very much focussed on Python development, but it includes interviews with people doing quite fascinating things with Python. Expect to hear about massive Python suites being developed, Python running at speeds you mightn’t have expected from an interpreted language, and library development. I especially recommend the episodes on Requests and Fluent Python, once you’ve spent some time writing Python. Both are very revealing, and Fluent Python is in my reading list later on.

Python Testing
This is probably the best match to what I’m writing about when I talk about testing with Python. It’s a little drier and more technical than Talk Python to Me, but it’s definitely worth the listen. Lots of interesting titbits about testing with Python, the frameworks and the practicalities. The presenter’s somewhat lugubrious delivery might take some getting used to, but it’s well worth the effort.

Code Newbie
Less Python focussed, this is just a series of interviews with people talking about their experiences learning to program. Personally I didn’t find this podcast to my taste, a little too general and rambling for me, but useful if you want some inspiration from other people’s experiences.

Finally, a reading list:
Learning Python
Great book with oodles of information about Python and probably life itself. Very comprehensive and a very good reference for anyone learning Python. Expect it to sprout post-its like mushrooms after the rain. If I could only have one Python book, it would be this book. It’s not really a book about learning by example, I recommend it in parallel with the courses above and as a reference book.

Fluent Python
Another brilliant O’Reilly book, but definitely for the more advanced user. Lots of excellent information about the more advanced and deep aspects of Python. Very strongly recommended, but definitely not a first book.

Black Hat Python
In the vein of Automate the Boring Stuff with Python, this lets you hit the ground running on security testing and some useful network testing scripts.

As previously stated however, the absolute best way to learn is to keep writing programs! Find something you are passionate about and write it!

How the mindset of a gamer helps in testing

My last blog post ended with the comment that I had started out with the intention of writing about how thinking about a problem like a gamer can help you as a tester.

Gamers have goals. It’s basically the defining characteristic of someone playing a game: they have a goal they want to achieve. It can be to get past a level, beat an opponent, gather materials or beat a certain time. The objective itself doesn’t matter, but the goal-oriented mindset does.

When you are testing, what are you trying to do? Much of the time you might be testing a specific fix, or trying to catch any regressions, but is that really the best use of your time? When you set goals, don’t be too focused on the immediate, and remember one thing above all others: you’re not there to pass tests. Testers are there to make sure things work, not to say “This test passes”. Passing tests is emphatically not your goal. If it is, then you’re not testing the product, you’re sucking up to developers, and they won’t thank you for it.

So what are reasonable goals? To catch every bug in the software? Not even slightly possible if your codebase is more than a hundred lines of active code, including any library code. To exercise the happy paths? Definitely possible, but try to aim a little higher. To test a feature as thoroughly as possible in the time allowed? Finally something realistic, but somewhat nebulous. Let’s see if we can firm it up a little. What are you testing? This is the ultimate arbiter of what your goals should be. If you’re testing a simple one-line fix, you probably only need to check its immediate impact. If it’s a new feature that reaches deep into the codebase, you will need to guard against regressions in more edge cases. A whole new product? You’re going to need a good test plan for that.

Now, how do you organise your test plan? A gamer approaches their goals by breaking them down into manageable chunks. Your ultimate aim could be to finish the game, but broken down, it’s the next level, the right equipment, the hidden side quest: lots of small goals that together will lead you to the ultimate aim. The same holds true for testing. Break it into little chunks and face them head on. Your final boss is your users and they will leap on any weak points you have, so your game is to fix all the holes in your application, coating it in impenetrable armour. You need to work on each area individually before you polish the suit of armour and face that boss.

Breaking it down is easier than you might think. Every piece of software naturally falls into a few categories of pieces: user interface, input/output, data manipulation and the dark magic bit. The first three are fairly self-explanatory, but the dark magic bit is the most important. I’ve called it dark magic because it’s the part of your software that does what no one else does in quite the same way. The dark magic bit (DMB) is your unique selling point and it’s also the part of your codebase that you can’t generally approach with an off-the-shelf testing solution. It could be a certain algorithm, a method of parallelised processing or some super low-level hooks into the kernel or below. No matter what it is, it’s going to be the part of your codebase that needs the most scrutiny. There are libraries and standard testing tools for things like I/O, networking and database interfaces, and it’s likely that if it’s not your DMB, it’s going to rely on something pretty standard. No one likes to reinvent the wheel for a mundane task when they get to make a triangular wheel which can roll over water for the DMB.
Of course though, your software has to work. Even if the DMB is made of purest genius distilled into code, there has to be a way for it to communicate with the outside world and for users to interact with it. So it’s not that you can just ignore the standard parts of your codebase, just that you have standard tools and approaches available to you for these tests. It can often be that you spend more time on the actual testing of your standard components than you do on your DMB. Don’t be deceived though: your DMB is still where most of your creative effort is spent.

In essence the DMB is the item or person you need to protect to defeat the boss, everything else is testing the entourage and guards. Approach your DMB as you would armouring your most precious resource and you’ll not go far wrong.

Of course if it all gets a bit much then take a break and play a game, it can only help, right?

The appliance of gaming science

Or how playing computer games can help you test.

I’m pretty certain that there’s a large overlap between those who work in IT and those who game. It’s not one hundred percent, but the Venn diagram doesn’t have a lot of space in it. So now it’s time to explain how that can possibly help you as a tester.

There’s as many types of video gamer as there are people who play videogames, but they tend to fall into a few major archetypes. There’s the classic power gamer – completist, min-maxer* who will research how to win, learning hundreds of key combinations if that’s what it takes. There’s the roleplayer who focusses more on character and story than on statistics. There’s the casual gamer who plays to kill a few minutes here and there who doesn’t care if they’re not first to do something as long as they have fun. Recently there’s also been the mobile gamer who plays puzzle/social games.

Now of course there’s massive overlap between these archetypes and none of them is going to represent anyone completely. Most people are a mix of all or some of these, but each one can teach us something about how to test, so the more of a crossover you have, the more pools of experience you can bring to bear.

The Power Gamer

This archetype is probably what most people think of when they think gamer. What they might not realise is that the archetype is also the embodiment of the motto semper paratus**, always prepared. For a tester this means learn your product. What should it do? How should it do it? What should it not do? What external interactions can affect the result? For example, should your messaging app deliver messages within a certain time? Should it always be listening for messages? How does it handle network timeouts? What happens if it gets closed before a message arrives? Is it peer-to-peer or server-client? What are the weaknesses of each? How do you test them? Is this the most efficient way to test this product? The power gamer looks at all these things before making a move. Planning is key.

The Roleplayer

This is more often associated with tabletop gamers, players of Dungeons and Dragons, or Live Action RolePlayers. This archetype gives us something incredibly valuable to a tester, empathy with the end user. A roleplayer can assume another character and act out their reactions to various events, which is exactly what we need to do as testers. When a new user encounters your product, how do they interact with it? What cues do they have? What does a regular user do most often, is it easy? What’s the least common thing you’d do as a user? Is it OK to bury it in a few layers of menus? The roleplayer in you can help you answer these questions and more.

The Casual Gamer

The casual gamer is the master of making each moment count. Got a spare half an hour? Advance a little way in a game, watch some YouTube videos about the game, maybe even read up on what the power gamers are doing. They bring another useful skill to the table: maximising what you can do in the time available to you. This can mean a bit of exploratory testing when you have a few minutes, but it can also be so much more. Do you have fifteen minutes while that build finishes? Practise a bit of programming and finish your script! Got five minutes before a meeting? Read some of your engineering documents or even a testing blog or two! Casual gamers make the most of the time they have, and testers should too.

The Mobile Gamer

The mobile gamer is related to the casual gamer, but often more heavily focussed on puzzle-based gaming. This gives them a keen eye for pattern recognition and the ability to work on problems subconsciously while they focus on other tasks. These are two very powerful tools for everyone, not just a tester, but the ability to spot when something isn’t quite right, before it leads to a problem, is the very nature of what it means to be a tester. We’re the stitch in time that saves nine. It’s very often the case (at least for me) that the solution to a problem comes to me when I am doing something completely unrelated. Learning to allow your mind to work on things in the background is definitely something every tester should work at.

No matter how you game, you’re learning skills you can bring effectively to testing.

N.B.
This post was originally on how the visualisation techniques of gaming could be brought to bear to help in test strategising, but morphed as I was writing it. I do plan to come back to that theme in the future though.

* A min-maxer is someone who tries to get the most power output from the least input, the maximum from the minimum.

** I would also accept estote parati

How do you learn to program?

Following on from my last post, you might’ve decided you want to learn how to program, but now you need to work out how. First off, let me make a confession: I am a massive Python fanboy. It’s got the ease of reading and the speed of writing, and you can get some deep C hooks and exploit some speed that you might not expect from an interpreted language. Yes, it has its cons: not everyone likes duck typing, significant whitespace or the fact that if you’re using Python for threads, you’re going to have a bad time (I/O threads excluded). Still, despite its drawbacks, I’ve never used any programming language as quick and enjoyable to write.

So I’ll come right out and say it: if there’s a choice, use Python. You might not have a choice. Your automation framework might be Java, Ruby, JavaScript or even C or C++, and that’s without stepping into .NET shops where C# might rule the roost. You’ve got to learn a language that you can use regularly or you’ll learn slowly.

So that, right there, gives us our first method for learning to program, write programs. Sounds simple, but is harder than you think. You can follow along with any number of lessons from the internet that tell you how to write programs, you can do that and sure, you’ll learn the content no doubt. But if you want to learn it faster? You do the very basic lessons and then you start to write programs that matter to you.

Secondly, care about it. This is a very hard one to envisage yourself ever doing, but much easier to actually do. Get into your programs, when you learn something new you should stop for a moment and take pride in it. Every new method you learn to solve a programming problem is another tool in your tool box. When all you have is a hammer, everything looks like a nail. When you’ve got a properly filled toolbox, you can do a lot more.

Thirdly, learn to love failure. A lot’s been written on this subject, but programming always takes place on the very edge of your competence. If you know how a problem should be solved, it’s just rote work to get the job done and you’ll find your attention slipping, mine does. But when you’re faced with something you don’t understand how to do, that’s when it gets interesting. Suddenly you’re forced to learn again, learn more. And what you learn, sticks, because you take pride in your solution to a new challenge. You will fail. You will fail over and over. At first you’ll be failing at the small things, punctuation, capitalisation. Later on you’ll be failing because you’re reading from random memory or not checking your inputs for buffer overflows.

You might enjoy the early failing more. But if you accept it’s going to happen, you’ll enjoy all of it as a learning experience.

When I am in a new job/role, the first thing I do is look for things I can automate. Recently I discovered that I’d be spending a lot of time changing command line arguments in the product I’d be testing and then reinstalling and checking a new build. Fine, except that unless you do a complete clean uninstall, the config persists. Not optimal. It’s hard to remember everything you’ve changed every time, so the very first thing I wrote was a quick Python script (less than 100 lines, including comments) which reset all the variables, set some debug flags, installed the latest licence…

I don’t file bugs because of vestigial configuration settings.

Did I learn anything from writing this? I mean I’ve been writing Python for four years now, surely I’d know everything there is to know about… Oh wait, no I learnt about the subprocess library and I used some nice dictionary walking techniques. I’m proud of it too. I am emphatically not a programmer, I write some programs and scripts here and there to help with my everyday life. I’ve had no real formal training in programming and I know maybe three or four algorithms properly (and that due to an edX course in Algorithms). So for me, every time I open Sublime Text and start to bash out a Python script, it’s going to be a learning experience. That’s why I love it.
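To give a flavour of what subprocess plus dictionary walking can look like, here is a sketch. It is not the script described above: the settings, the dotted key format and the “set” sub-command are all hypothetical, and a real product’s command line will differ.

```python
import subprocess

# Hypothetical defaults; the real settings depend on the product under test.
DEFAULTS = {
    "logging": {"level": "debug", "to_file": True},
    "network": {"timeout": 30},
}


def walk_settings(settings, prefix=""):
    """Yield (dotted.key, value) pairs from an arbitrarily nested dictionary."""
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from walk_settings(value, path)
        else:
            yield path, value


def build_commands(tool, settings):
    """Build one argument list per setting, ready to hand to subprocess.run."""
    return [[tool, "set", path, str(value)] for path, value in walk_settings(settings)]


def reset_config(tool, settings=DEFAULTS):
    """Apply every default; check=True raises if the tool exits with an error."""
    for command in build_commands(tool, settings):
        subprocess.run(command, check=True)
```

Building the argument lists separately from running them makes it easy to print the commands for a dry run before letting the script loose on a real install.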

Do I fail? I fail hard. I fail often. I sit staring at my program wondering why nothing works for ages. Then I walk to get a coffee, or start to go home and it comes to me. Ask anyone who writes programs, those moments of enlightenment are worth all the frustration and puzzlement.

Though there’s a lot more enlightenment and a lot less boilerplate in Python!

Bash on Windows 10

If you’re following the tech press at all, you can’t have missed the news that Microsoft have, in partnership with Canonical (the company behind Ubuntu), released bash on Windows 10.

 

However, you might not know exactly what that means, or why you should care. Luckily I am one of the brave souls who play in the Windows 10 Insider Fast ring, which in practical terms means I get to reinstall Windows about once a week, but also means I get to play with next season’s toys today.

 

One of these toys is bash on Windows. This was fantastic news for me, as I’ve just changed jobs to a purely Windows company and was sorely missing the power of a proper bash terminal. But that’s too far into the story. Let’s start with what bash is.

The Bourne Again SHell, or bash, is a command-line shell for Linux and Unix systems that is also a scripting language for the same shell. If you’re a Windows person, think cmd or PowerShell, but much more powerful and with an absolute plethora of tools to make a developer or tester’s life easier in the command line. Now this is not the first bash shell on Windows, far from it. There’s been Cygwin and Git Bash for years and by and large they’ve done a decent job. But this is quite different. Cygwin and other bash shells on Windows have, until now, recompiled the binaries for their tools into Windows exes. This means you can’t just take any off-the-shelf Linux or Unix tool and expect it to work; someone has to have worked at it to make it run on Windows. This version from Canonical and Microsoft is something quite different, quite special and just a little bit ironic*. What it relies on is the “Windows Subsystem for Linux” (which to my mind should be the Linux subsystem for Windows, but hey). This subsystem is essentially a compatibility layer, which takes system calls from Linux binaries and converts them on the fly into their equivalent Windows calls.

So why is this special? It means you can take any old Linux binary for x86-64, and run it, assuming the dependencies are satisfied. Any old binary like, say, apt-get or dpkg. Now if you’re not familiar with Linux, you might not have heard of a package repository. This is like the Windows store, except really good and people use them. Canonical release Ubuntu, and Ubuntu has an absolutely massive set of packages in their repositories. This version of bash has access to those repositories.

This is like adding tens of thousands of tools to Windows, in one go.

 

So, who cares? It’s not like you needed these tools before, why do you need them now? Well I care, I care a lot. I care enough to write this blog post. Here’s why. These tools are industry standards, and they do things you might not have believed possible. The core philosophy of Unix is to do one thing and do it well, and these tools do exactly that. There are things like ls, which is vaguely analogous to dir on Windows. It lists the files in a directory, so far so humdrum. Then there’s mv which moves files, just like Windows’ move. Nothing to get excited about here. Then there’s cat, which displays the contents of a file, just like type on Windows. But then we can move on to more interesting utilities, like grep, which searches for a pattern (and it can be a regular expression, a sentence, a string or even just a particular number) and shows you where it found it. There’s sed, which is a Stream EDitor, that looks for patterns and changes what it finds. So far, nothing earth-shattering.

But bash has a trick up its sleeve. It’s a simple trick, but it’s an absolute corker. You can take the output from one command and use it as the input for another, and then the output from that command can be used as the input for another. So you go to your photos directory and you have a look around. You type ls and get back a list of all the files in that directory. But you’re only interested in the ones you took with Tony the Tiger, and you remember naming those Tony_tiger_XX.jpg where XX is a number. So you add the pipe “|” to the end of the ls command and then add another command, grep Tony_tiger. This gives you the command

ls | grep Tony_tiger

and it’ll only list the files with Tony_tiger in their names. But then you want to move them to another folder, say a Tony_Tiger folder. So you make the folder with mkdir Tony_Tiger (same command as on Windows), then you start to get creative with something called xargs. xargs takes standard input and turns it into arguments for another command. grep is case sensitive, so the Tony_Tiger folder itself isn’t caught by the pattern. So to move all the files with Tony_tiger in their names to the Tony_Tiger directory, you can use this command

ls | grep Tony_tiger | xargs -I{} mv {} Tony_Tiger/

(You can actually do this using the photos’ metadata rather than their file names, renaming on the fly with another utility, but it’s too involved for this explanation, though infinitely more useful.)
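If you’d like to try the pipeline without risking your real photos, here’s a rough sketch that builds a pretend photos folder in a temp directory first. The file names are invented for the demo. The find variant at the end is my own addition rather than part of the walkthrough above, and it assumes the GNU find and xargs that ship with Ubuntu.

```shell
#!/usr/bin/env bash
# Build a pretend photos folder (all names here are made up).
photos=$(mktemp -d)
touch "$photos/Tony_tiger_01.jpg" "$photos/Tony_tiger_02.jpg" "$photos/beach_01.jpg"
mkdir "$photos/Tony_Tiger"

# The pipeline from the post: list, filter, move each match.
# grep is case sensitive, so the Tony_Tiger folder is not matched.
# The subshell keeps the cd from leaking into the rest of the script.
(cd "$photos" && ls | grep Tony_tiger | xargs -I{} mv {} Tony_Tiger/)

moved=$(ls "$photos/Tony_Tiger" | grep -c 'Tony_tiger')
left=$(ls "$photos" | grep -c '\.jpg$')
echo "moved: $moved, left behind: $left"

# A sturdier variant for awkward file names: find hands over
# NUL-separated names, which xargs -0 reads without mangling them.
touch "$photos/Tony_tiger 03.jpg"
find "$photos" -maxdepth 1 -type f -name 'Tony_tiger*' -print0 \
  | xargs -0 -I{} mv {} "$photos/Tony_Tiger/"

final=$(ls "$photos/Tony_Tiger" | grep -c 'Tony_tiger')
echo "in Tony_Tiger now: $final"
```

Parsing the output of ls is fine for tidy names like these, but it can trip over names containing quotes or newlines, which is why the find version is worth knowing about.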

So while that example was fairly facile, you can see that with this piping, you can achieve some remarkably powerful effects with a very concise command, very quickly and efficiently. Now I know some purists out there are going to say that Powershell can do the same thing, which it sort of can. But for me Powershell doesn’t have the plethora of tools available, and not all of its tools work in stream mode (dir, for example, blocks, so it has to finish before it’ll release the results to the pipe). It also works in objects, not text, which makes things a bit different.

 

So now you know what bash is, why you might want to use it and a little bit about how it works. Now let’s talk about the actual subject of the post, bash on Windows 10.

I’ve been playing with it for a few hours now and I’ve found some of the limitations of this new bash. As it stands there are swathes of the standard Linux file system missing or not mounted. The reason for this is that Windows isn’t Linux, so a lot of the layer is faking things that don’t exist. The file systems are completely different, and how they treat things is completely different. Linux sees everything as a file, sort of, so you plug a USB stick into your machine and suddenly your /dev/ directory has a new file for it, something like sdb (a USB serial adapter would turn up as ttyUSB0). Windows doesn’t exactly work like that; it doesn’t have a central store of files representing every piece of hardware attached to the system. Linux stores active process and system information in /proc, and Windows doesn’t have that exactly. Linux stores executables (or symlinks to those executables) in /bin or /usr/bin (or indeed in /usr/sbin), while Windows leaves the executables in various folders in Program Files along with the data they rely on. The two systems are very different, so not everything works.

I can’t do anything that requires raw or netlink sockets, so I can’t ping and I can’t see my IP. I also can’t do a lot of file system based stuff, like locate or df -h. These things simply don’t work; the systems to support them simply aren’t in place.

Programs like elinks (a terminal based web browser) and tmux (a Terminal MUltipleXer) don’t work quite as expected.

There’s no X server, nor even a Mir server. This means there are no graphical utilities available. (Though I’ve seen someone use Cygwin’s X server to run Ubuntu bash utilities already. Which is awesome, but not ideal.)

What does work?

One hell of a lot. The mere fact that I can use standard bash tools on the windows file system means that I can suddenly have some handy scripts that will do a lot of heavy lifting for me. There’s cron, so I can schedule these scripts to run at certain times and most importantly, there’s the entirety of the repositories.

wget works, curl works. So while I can’t ping a website, I can still grab its information. Python comes pre-installed in both 2.7 and 3 versions. An SSH server’s installed, which is about the most monumental thing ever. Cygwin’s had one for years but it’s been quite ropey in use for me, and very slow. I can get git and it’s about a zillion times faster than through GitBash or Cygwin in my testing. It’s got incidental little tools like awk and sed and… well it’s full of all the things I’ve taken for granted in the last four years where I’ve mainly been using Linux.

Is it perfect? Nope. Is it a replacement for a dedicated Linux machine? Certainly not.

Is it worth it? Yep. It removes one more layer of friction between a Windows machine and Linux machines. It makes it so you can happily ssh into your Raspberry Pi from a cmd prompt. It means you can use editors that format line endings properly. You can cat and grep files and search folders for strings based on regexes. You can do so much today that you could not do last week and you can do it in a supported way.
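To give a flavour of searching folders for strings based on regexes, here’s a small sketch using grep’s recursive mode. The folder layout, file names and log lines are all invented for the example.

```shell
#!/usr/bin/env bash
# Build a small made-up folder tree to search through.
notes=$(mktemp -d)
mkdir -p "$notes/logs/old"
printf 'error 404 on /index\n' > "$notes/logs/web.log"
printf 'all fine\nerror 500 on /api\n' > "$notes/logs/old/web.log"
printf 'no errors here\n' > "$notes/readme.txt"

# -r searches every file under the folder, -n adds line numbers,
# -E switches on extended regular expressions.
# The pattern means: "error", a space, then a 4xx or 5xx code.
hits=$(grep -rnE 'error [45][0-9]{2}' "$notes")
echo "$hits"

# -l prints just the names of the files that matched.
matching_files=$(grep -rlE 'error [45][0-9]{2}' "$notes" | grep -c .)
echo "files with matches: $matching_files"
```

The readme is skipped even though it contains the word “errors”, because the pattern insists on a space and a three digit code straight after “error”.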

Do I recommend it? It depends. If you’re used to Linux utilities, but you work in Windows? Yes, yes, yes a thousand times yes. If you want to learn some Linux commands without the hassle of dual booting or the overhead of a VM, it’s definitely worth a go. If you never venture into the command line anyway, it’s not going to add much to your world, probably best to let it slide by. If you want a full Linux system with every possible utility at your fingertips? Nope, it’s not even close, but I hope it’ll get closer.

The thing to remember is that it is quite definitely a work in progress. Nothing is quite finished and there are rough edges everywhere you look. It’s an amazing first step though, and I am really looking forward to using it all the way through to the finished product.

 

*It’s ironic because for decades, Linux has run Windows executables using something called WINE (which stands, recursively, for Wine Is Not an Emulator). WINE is a compatibility layer for Windows programs on Linux. It’s the exact mirror of this Windows Subsystem for Linux.