Difference between revisions of "Python vs Ruby"

From Exterior Memory
Revision as of 16:44, 22 March 2012

A lot has been written comparing the Python and Ruby languages. For no particular reason, I'm adding a few more lines about it.

The Commonalities

Python and Ruby have much in common. Both are modern languages with all the modern requirements:

  • Namespaces
  • Garbage collection
  • Object Oriented
  • Exception Handling
  • Proper Unicode support (Ruby 1.9 and up; Python 3 and up)

What's more, they both

  • are Interpreted (scripting) languages
  • are Dynamically typed
  • have a large user base
  • support a wide range of platforms (Linux, BSD, Windows, Mac). In fact, both come standard on most of these platforms.

But talking about commonalities is boring. Let's talk about the differences in syntax, language concepts and available codebase.

Compared to Other Languages

It should be understood that neither Python nor Ruby is an "enterprise" language (like Java), in the sense of having strict typing or making sure that all exceptions are caught. Instead, both Python and Ruby are geared towards rapid prototyping.

Many Python and Ruby programmers despise Java for its verbosity. While I do like Python and Ruby, I think the comparison is unfair.

C and C++
are great if you like a fast program, and are prepared to cope with headaches about OS (un)interoperability, and a lot of boring low-level stuff.
Python or Ruby
are great if you like rapid prototyping at the expense of speed and robustness.
Java
is great if you like robust and portable code, at the expense of spending a lot of time writing error handling code. Great if you know the specs beforehand, and know that these specs are not going to change. (Good luck with this last one).
Objective C and C#
are great if you enjoy lock-in by a single large vendor.

Please don't claim that Python or Ruby programs are generally bug-free. The whole point of a dynamically typed language is that you operate on unknown objects, and basically hope it works fine. That's great because you can very easily change your data structures without rewriting code. Hence the rapid prototyping. But it's not guaranteed to be bug-free.

The Syntax

Variable Names

Ruby (as the name implies) is strongly influenced by Perl. Unfortunately, this means that it also inherited the incomprehensible global variables.

Compare:

Ruby    Python
$<      sys.stdin
$>      sys.stdout

While a programmer writing code has to spend roughly the same effort to memorise $< and $> as sys.stdin and sys.stdout, someone who is reading the code has a much easier time understanding Python.

Class Methods

Class methods are more elegantly written in Python than in Ruby.

Python uses decorators for class methods and static methods (class methods can access the class variables, static methods cannot).

class Date():
   @classmethod
   def fromtimestamp(cls, timestamp):
      """Return a date by its POSIX timestamp"""
      ...
   @staticmethod
   def fromdate(year, month, day):
      """Return a date by its Gregorian year, month and day"""
      ...
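
A minimal runnable sketch of the difference between the two decorators. The SimpleDate class, its constructor and the is_leap helper are hypothetical and only serve as illustration; they are not the Date class above.

```python
import datetime


class SimpleDate:
    """Hypothetical date class, used only to illustrate the two decorators."""

    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    @classmethod
    def fromtimestamp(cls, timestamp):
        """Return a date by its POSIX timestamp (interpreted as UTC)."""
        # cls is the class the method was called on, so subclasses
        # get instances of their own type.
        d = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
        return cls(d.year, d.month, d.day)

    @staticmethod
    def is_leap(year):
        """Static method: receives neither the instance nor the class."""
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class MyDate(SimpleDate):
    pass


d = MyDate.fromtimestamp(0)
print(type(d).__name__, d.year, d.month, d.day)
print(SimpleDate.is_leap(2012))
```

Because fromtimestamp receives the class as cls, calling it on the subclass returns a MyDate rather than a SimpleDate.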

Ruby uses the self. keyword to refer to the class object, and places a function in the class rather than in the instance.

class Date
   def self.fromTimestamp(timestamp)
      # Return a date by its POSIX timestamp
      ...
   end
   def self.fromDate(year, month, day)
      # Return a date by its Gregorian year, month and day
      ...
   end
end

Equivalent alternative #1

class Date
   ...
end
def Date.fromTimestamp(timestamp)
   # Return a date by its POSIX timestamp
   ...
end
def Date.fromDate(year, month, day)
   # Return a date by its Gregorian year, month and day
   ...
end

Equivalent alternative #2

class Date
   class << self
      def fromTimestamp(timestamp)
         # Return a date by its POSIX timestamp
         ...
      end
      def fromDate(year, month, day)
         # Return a date by its Gregorian year, month and day
         ...
      end
   end
end

Ambiguity

Ruby syntax can be ambiguous:

functionA 1, functionB 2, 3

This may mean either of:

functionA(1, functionB(2), 3)
functionA(1, functionB(2,3))

The good news is that Ruby allows you to add as many parentheses as would be required to make even a Lisp addict happy.

Block Syntax

Any self-respecting article discussing Python's syntax can't get around it: indentation as a way to signify nested blocks.

While it has been much debated, I personally find it sheer brilliance. Nearly all attempts at a decent block syntax struggle to somehow combine the condition and the start of the nested block on a single line. The only ones that succeed here are Python, Ruby and PHP's old-style if/endif control structure syntax (that no one seems to use anymore).

Ruby:

if true
   return :yeah
else
   return :nay
end

Python:

if True:
   return "Yeah"
else:
   return "Nay"

Both syntaxes are very readable, and avoid the dreaded if condition { block } syntax.

Ruby defines a second kind of closure (next to the normal methods and lambdas), which is a block (or Proc). While conceptually cool, the syntax is not so great. Ruby allows both:

sum = 0
array.each do |item|
   sum += item
end

sum = 0
array.each { |item|
   sum += item
}

Regardless of whether Pascal (begin/end) or C ({/}) is your big inspiration, neither of these syntaxes is very inspired, and allowing two different syntaxes will only give rise to heated debates between the two camps of style fanatics.

Not that Python is debate-free. Care for some tabs-versus-spaces discussion, anyone?

Conclusion

The better readability and the other features make me conclude that Python's syntax is superior to Ruby's.

From The Zen of Python:

Readability counts.

The Language

One Way

In the syntax section above, I argued that code should not only be easy to write, but also easy to read. Indeed The Zen of Python not only says Readability counts, but also:

There should be one-- and preferably only one --obvious way to do it.

This does not apply to Ruby. For example, the map and collect methods are just aliases, just like the find_all and select methods.

While Python's intentions are clear, in practice this does not always hold. Imagine you are reading a sequence of bytes from a file: Python gives you the primitives bytes (immutable) and bytearray (mutable), not to mention a (slower) list. Also, you can use either struct.unpack or array.array to do the conversion. Not really "one obvious way" anymore.
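
To make the "two ways" concrete, here is a minimal sketch that decodes the same eight bytes with both struct.unpack and array.array. The sample data (four unsigned 16-bit integers) is an assumption chosen for illustration, and the sketch assumes a platform where an unsigned short is two bytes.

```python
import array
import struct

# Four unsigned 16-bit integers, packed in native byte order.
raw = struct.pack("=4H", 1, 2, 3, 4)

# Way 1: struct.unpack returns an immutable tuple.
via_struct = struct.unpack("=4H", raw)

# Way 2: array.array builds a mutable, typed array from the same bytes.
via_array = array.array("H")
via_array.frombytes(raw)

print(via_struct)
print(via_array.tolist())
```

Both calls yield the same four numbers; they differ in the container you get back (a tuple versus a typed, mutable array).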

Objects

In Ruby, everything is an object. Every Ruby programmer has marvelled at the fact that even numbers are objects:

5.nonzero?

Python, on the other hand, is often criticised for defining global functions such as len() instead of adding them as a method to every object (behind the scenes, the len(x) function does call a method, x.__len__()).
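
A small sketch of that delegation, using a hypothetical Playlist class that exists only for this illustration:

```python
class Playlist:
    """Hypothetical container, used only to show how len() delegates."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # Called by the built-in len() function.
        return len(self._songs)


p = Playlist(["a", "b", "c"])
print(len(p))        # goes through Playlist.__len__
print(p.__len__())   # calling the method directly gives the same result
```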

To me, the fact that len is a function instead of a method is irrelevant. The fact that nil (Ruby's counterpart of None) is a useful object in Ruby did appeal to me.

For example, it is possible to write:

nil.to_i

and Ruby returns a 0.

The Python equivalent, int(None), raises a TypeError.

Python's None is also an object, but has only about 20 methods (the same number as a plain object), while Ruby's nil object has 55 methods.
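
The Python behaviour can be checked with a short sketch. The helper to_int_or_zero is hypothetical; it only imitates what nil.to_i does in Ruby and is not part of Python.

```python
def to_int_or_zero(value):
    """Hypothetical helper imitating Ruby's nil.to_i for None."""
    if value is None:
        return 0
    return int(value)


try:
    int(None)
except TypeError as exc:
    print("TypeError:", exc)

print(to_int_or_zero(None))
print(to_int_or_zero("42"))
```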

Opening objects

The real power of Ruby comes from the ability to add methods to existing classes in place.

Consider:

class ::Integer
   def even?
      return (self % 2) == 0
   end
end
2.even?
=> true

How is this useful?

Imagine you are reading some JSON data in Python:

import json
import urllib.error
import urllib.request

try:
   page = urllib.request.urlopen("http://www.example.com/json/address?id=2")
   address = json.loads(page.read().decode('utf-8'))
   if not isinstance(address, dict):
       raise TypeError("Invalid JSON at URL")
except (urllib.error.URLError, ValueError, TypeError) as e:
   # Fetch failed, JSON was malformed, or the top-level value was not a dict
   raise

Now you can start processing the data:

country = address["country"].upper()

Even though we dealt with a failing URL fetch or bad JSON syntax, this last line can fail because "country" is not a valid key, or because address["country"] is not a string (but e.g. None), not to mention it might be an empty string.

try:
   country = address["country"].upper()
except (TypeError, AttributeError, LookupError):
   country = ""

If you want to set many variables, this quickly becomes a lot of repeated code. Python's equivalent of the ?: ternary operator can turn this into a one-liner:

country = address["country"].upper() if ("country" in address and address["country"] is not None) else ""

This is shorter, but not very efficient and still not very readable.

In Ruby, it is possible to open an existing (even built-in) class, and add a method to it:

class ::Hash
   def value_or_empty key
      if self.has_key? key
         self[key]
      else
         "no"
      end
   end
end

And the code simply reduces to:

country = (address.value_or_empty "country").upcase

This particular example can even be written without resorting to modifying the Hash class:

country = address["country"].to_s.upcase

This is possible because Ruby returns nil for an undefined key in a Hash, and because nil.to_s returns an empty string.

This is not easily possible in Python. In Python, you would need to write a decorator class to accomplish the same:

import collections

class defaultvaluedict(collections.UserDict):
   """Dict decorator that returns an empty string for unknown keys"""
   defaultvalue = ""
   # UserDict stores the content of the dict in self.data
   def __getitem__(self, key):
      try:
          return self.data[key]
      except LookupError:
          return self.defaultvalue

address = {"country": "nl", "name": "me"}
address = defaultvaluedict(address)
country = address["country"].upper()
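
As a side note, for this particular lookup the standard dict.get method may already be enough, without a wrapper class. A minimal sketch, assuming an address dict like the one above; the helper name upper_or_empty and the extra "city" key are hypothetical:

```python
def upper_or_empty(mapping, key):
    """Return mapping[key] upper-cased, or "" if the key is missing or None."""
    value = mapping.get(key)  # dict.get returns None for a missing key
    return str(value).upper() if value is not None else ""


address = {"country": "nl", "name": "me", "city": None}
print(upper_or_empty(address, "country"))
print(upper_or_empty(address, "city"))
print(upper_or_empty(address, "zip"))
```

Unlike the Ruby version, this is a free function rather than a method on every dict, so it has to be imported wherever it is needed.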

While I admire Ruby's flexibility, I do fear that changing objects in place may yield name collisions. What if I add a method whose name is also used by a popular framework, like Rails? It is likely that my code will break.

Protected Methods

Python provides some syntactic sugar to support class-private names, but for all intents and purposes Python has no protected or private methods or variables. Ruby (like most other programming languages) supports protected methods, as well as private, read-only and read/write variables.

In fact, the two solutions provided by Python that I know of are hardly used:

class MyClass(object):
   def __init__(self):
      self.__var = ""

a = MyClass()

While a.__var raises an AttributeError, the variable is not really protected, but only mangled. It can still be accessed (and overwritten) using a._MyClass__var. This kind of syntactic sugar reminds me of security through obscurity.
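
The mangling can be observed with a short sketch. The value "secret" is an arbitrary placeholder chosen for this illustration.

```python
class MyClass(object):
    def __init__(self):
        # Inside the class body, __var is rewritten to _MyClass__var.
        self.__var = "secret"


a = MyClass()

try:
    a.__var
except AttributeError:
    print("a.__var is not reachable under that name")

# The mangled name is still readable and writable from outside.
print(a._MyClass__var)
a._MyClass__var = "overwritten"
print(a._MyClass__var)
```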

Named Parameters

Ruby has no named parameters (only positional ones); they will be introduced in Ruby 2.0. Python has them.
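
A minimal sketch of named (keyword) parameters in Python. The connect function and its parameters are hypothetical; only the call syntax matters here.

```python
def connect(host, port=80, timeout=10.0):
    """Hypothetical function that just returns its arguments."""
    return (host, port, timeout)


# Skip a default in the middle by naming the parameter you want to set.
print(connect("example.com", timeout=2.5))

# With names, the order of the arguments no longer matters.
print(connect(port=8080, host="example.com"))
```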

Blocks

nested

http://www.robertsosinski.com/2008/12/21/understanding-ruby-blocks-procs-and-lambdas/

Symbols

labels, named constants Ruby: I love the Symbols (e.g. :label)


Tuples

Python has tuples; Ruby does not (only arrays).
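
A short sketch of what sets a Python tuple apart from a list: it is immutable and, when its items are hashable, it can serve as a dictionary key. The variable names are illustrative only.

```python
point = (3, 4)

# Tuples of hashable items can be used as dictionary keys; lists cannot.
distance_from_origin = {point: 5.0}
print(distance_from_origin[(3, 4)])

# Tuples are immutable: item assignment raises a TypeError.
try:
    point[0] = 0
except TypeError as exc:
    print("TypeError:", exc)
```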


Yield

Python's yield syntax is easier to understand;

Ruby's yield is much more powerful.

#!/usr/bin/env ruby
def g
   i = 0
   while true
      yield(i)
      i += 1
   end
end

g do |n|
   puts n
   if n >= 10
      break
   end
end
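
For comparison, a rough Python counterpart of the Ruby snippet, written as a generator. This is a sketch, not an exact equivalent: Ruby's yield calls the block passed to g, whereas Python's yield suspends g and hands a value to the caller's loop. The seen list is added here only to collect the output.

```python
def g():
    """Yield 0, 1, 2, ... forever."""
    i = 0
    while True:
        yield i
        i += 1


seen = []
for n in g():
    print(n)
    seen.append(n)
    if n >= 10:
        break
```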

Symbol Table

Confusing: there is more than one symbol table.

Eg.:

irb(main):056:0> p "hello"
"hello"
=> "hello"
irb(main):057:0> p = 1
=> 1
irb(main):058:0> p p
1
=> 1

Here, "p" has two meanings: the method p and the local variable p.


Return Values

//Invalid in most programming languages. Two of the exceptions are Ruby and Scala
my_variable = if (condition) { "yes" } else { "no" };

?: ternary operator

my_variable = (condition) ? "yes" : "no";

As a side-note, it is perfectly possible

 C:          x = (condition ? "yes" : "no")
 Python 2.4: x = (condition and "yes" or "no")
 Python 2.5: x = ("yes" if condition else "no")

Type Checking

Code Validation

Python Zen

Errors should never pass silently.
In the face of ambiguity, refuse the temptation to guess.

strict typing, exceptions can pass

Code is not validated; code is only checked while it is run. An error in a less-often used branch
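
A minimal sketch of what "only checked while it is run" means in Python. The describe function and the misspelled name are made up for this illustration.

```python
def describe(value):
    if value >= 0:
        return "non-negative"
    # The undefined name below is only detected when this branch runs.
    return "negative: " + valeu_as_text


# The common branch works, so the mistake can go unnoticed for a long time.
print(describe(1))

try:
    describe(-1)
except NameError as exc:
    print("NameError:", exc)
```

The module loads and the function is defined without complaint; the NameError only surfaces once the rarely used branch is executed.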

Threading

Python multitasking support is a joke, despite the thread and threading modules in the standard library. The reason is the global interpreter lock (GIL), which effectively halts all but one thread to ensure consistent state among all threads.

Ruby has a global VM lock, which is roughly the same as the Python GIL.

The way to use concurrency in Python is to use the multiprocessing module, even though it has some limitations (mostly caused by the pickle format used to communicate between processes).

Nothing is True

In Ruby, 0 is true: only nil and false are treated as false.
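
Python takes the opposite view: zero, empty containers and None all count as false. A quick sketch (the sample values are chosen for illustration):

```python
falsy_values = [0, 0.0, "", [], {}, None]

for value in falsy_values:
    # bool() shows how the value behaves in an if statement.
    print(repr(value), bool(value))
```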

Unicode

Python 2 3 3.3 UCS-4 default

Conclusion

Libraries

Documentation

Python has easily accessible docstrings; Ruby does not.
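
A small sketch of how docstrings can be reached at runtime. The area function is hypothetical; __doc__ and inspect.getdoc are part of standard Python.

```python
import inspect


def area(width, height):
    """Return the area of a width x height rectangle."""
    return width * height


# The docstring is an ordinary attribute of the function object.
print(area.__doc__)

# Built-ins carry docstrings too; inspect.getdoc also cleans up indentation.
print(inspect.getdoc(len))
```

In an interactive session, help(area) shows the same text.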

Standard Library

standard library. Python much more mature. Ruby: String.shellescape (1.9.3), very poor logging facility (need log4r)

Package Manager

Integration with Other Languages

Java integration Objective C Dot net (C#)


Conclusion

Further Reading

sites:
http://yehudakatz.com/2010/05/05/ruby-1-9-encodings-a-primer-and-the-solution-for-rails/
http://www.ruby-lang.org/en/documentation/ruby-from-other-languages/to-ruby-from-python/

The Verdict