(think)

An online novel about the Source, the Force, the real life and everything in between...

Permalinks in the Clojure Style Guide

Recently, permalinks to rules were added to the community Ruby and Rails style guides.

I’m happy to report that now you can use permalinks to the rules listed in the community Clojure style guide as well.

Here’s an example. Now you can easily refer to rules in heated style debates with your friends and co-workers. :–)

This is an addition that was way overdue (for which I take all the blame). I’d like to say a big THANKS!!! to rbf who found the time I never did and got the job done (in style).

Permalinks in the Ruby and Rails Style Guides

I’m happy to report that now you can use permalinks to the rules listed in the community Ruby and Rails style guides.

Here’s an example. Now you can easily refer to rules in heated style debates with your friends and co-workers. :–)

This is an addition that was way overdue (for which I take all the blame). I’d like to say a big THANKS!!! to Tod Beardsley who found the time I never did and got the job done (in style).

P.S. Hopefully soon the permalinks will be leveraged by RuboCop.

Find Out Where a Rake Task Is Defined

Have you ever wondered where a particular rake task is defined? Enter rake -W (introduced in rake 0.9):

$ rake -W db:schema:load

rake db:schema:load                 /Users/bozhidar/.rbenv/versions/2.1.1/lib/ruby/gems/2.1.0/gems/activerecord-4.1.1/lib/active_record/railties/databases.rake:236:in `block (2 levels) in <top (required)>'
rake db:schema:load_if_ruby         /Users/bozhidar/.rbenv/versions/2.1.1/lib/ruby/gems/2.1.0/gems/activerecord-4.1.1/lib/active_record/railties/databases.rake:240:in `block (2 levels) in <top (required)>'

You can also invoke rake -W without an argument and you’ll get a listing of all available rake tasks and their source locations.

Pretty neat, right?
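The same information appears to be available programmatically. The sketch below assumes the rake gem is installed and relies on Rake::TaskManager.record_task_metadata and Rake::Task#locations, which, as far as I can tell, hold the data that rake -W prints. The hello task is made up for illustration.

```ruby
require 'rake'

# Ask Rake to remember where each task gets defined.
Rake::TaskManager.record_task_metadata = true

# A made-up task, defined through the Rake DSL (available at the top level
# once rake has been required).
task :hello do
  puts 'Hello!'
end

# Each entry is a backtrace-style string pointing at the definition site.
locations = Rake::Task[:hello].locations
puts locations
```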

The Elements of Style in Ruby #13: Length vs Size vs Count

One of the problems newcomers to Ruby experience is that there are often quite a few ways to do the same thing. For instance, you can obtain the number of items in Enumerable objects (instances of classes using the Enumerable mixin, which are often collections like Array, Hash, Set, etc.) by using either Enumerable#count or the method length and its alias size, which such classes often provide.

arr = [1, 2, 3]

arr.length # => 3
arr.size # => 3
arr.count # => 3

h = { a: 1, b: 2 }

h.length # => 2
h.size # => 2
h.count # => 2

str = 'name'
str.length # => 4
str.size # => 4
# str.count (with no arguments) won't work: String does not include Enumerable,
# and String#count expects the characters to count

Which one should you use? Let me help with this choice.

length is a method that’s not part of Enumerable. It’s part of a concrete class (like String or Array) and it usually runs in O(1) (constant) time. That’s as fast as it gets, which means that using it is probably a good idea.
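A quick way to see the distinction, as a minimal sketch: Enumerable itself defines count, but neither length nor size. The Countdown class here is hypothetical and exists only for illustration.

```ruby
# Hypothetical class that gets its collection behaviour only from Enumerable.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # Enumerable only needs #each; everything else is built on top of it.
  def each
    @from.downto(1) { |n| yield n }
  end
end

countdown = Countdown.new(3)

countdown.count                 # => 3
countdown.respond_to?(:length)  # => false
countdown.respond_to?(:size)    # => false

Enumerable.instance_methods.include?(:count)   # => true
Enumerable.instance_methods.include?(:length)  # => false
```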

Whether you should use length or size is mostly a matter of personal preference. Personally I use size for collections (hashes, arrays, etc.) and length for strings, since for me objects like hashes and stacks don’t have a length, but a size (defined in terms of the elements they contain). Conversely, it’s perfectly normal to assume that some text has some length. Anyway, in the end you’re invoking the same method, so the semantic distinction is not important.

Enumerable#count, on the other hand, is a totally different beast. It’s usually meant to be used with a block or an argument and will return the number of matches in an Enumerable:

arr = [1, 1, 2, 3, 5, 6, 8]

arr.count(&:even?) # => 3
arr.count(1) # => 2

You can, however, invoke it without any arguments and it will return the size of the enumerable on which it was invoked:

arr.count # => 7

There’s a performance implication with this, though – to calculate the size of the enumerable the count method will traverse it, which is not particularly fast (especially for huge collections). Some classes (like Array) implement an optimized version of count in terms of length, but many don’t.
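The traversal is easy to observe with a small sketch. The NoisyList class below is hypothetical: it includes Enumerable and records how many elements each hands out, so you can see that a bare count walks the whole collection while a stored size does not.

```ruby
# Hypothetical collection that tracks how many elements #each has yielded.
class NoisyList
  include Enumerable

  attr_reader :yielded

  def initialize(items)
    @items = items
    @yielded = 0
  end

  def each
    @items.each do |item|
      @yielded += 1
      yield item
    end
  end

  # Constant time: simply delegates to the underlying array.
  def size
    @items.size
  end
end

list = NoisyList.new((1..1000).to_a)

size_result = list.size          # => 1000
yielded_after_size = list.yielded  # => 0, nothing was traversed

count_result = list.count        # => 1000
yielded_after_count = list.yielded # => 1000, every element was visited
```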

The takeaway for you is that you should avoid using the count method if you can use length or size.

A note to Rails developers – ActiveRecord::Relation’s length, size and count methods have a totally different meaning, but that’s irrelevant to our current discussion. (Sean Griffin has written a comment regarding it).

That’s all for now, folks! As usual I’m looking forward to hearing your thoughts here and on Twitter!

A List of Deprecated Stuff in Ruby

As APIs evolve it’s inevitable that portions of them will be deprecated. Generally it’s fairly easy to find out what’s deprecated, but for several reasons that’s not the case in Ruby:

  • Deprecation is done through the use of C functions such as rb_warn & rb_warning (as opposed to a more transparent mechanism such as Java’s @Deprecated annotation). Messages emitted with rb_warning are shown only when you run Ruby with -w (those emitted with rb_warn are printed by default). Consider this example code:
string.lines do |line|
  puts line
end

$ ruby -w test.rb

../test.rb:1: warning: passing a block to String#lines is deprecated
  • Alternative Ruby implementations (like JRuby and Rubinius) generally don’t produce the same deprecation warnings. For instance – JRuby doesn’t produce any warnings for the code listed above. One can say that currently deprecations are an MRI implementation detail (although they shouldn’t be).

  • Deprecations are rarely mentioned in the API docs.

  • There’s no easy way to find out in which version of Ruby something got deprecated, as rb_warn is a generic mechanism for producing all sorts of warnings, as opposed to something created specifically to handle deprecations.

  • Some APIs are deprecated only informally (like Hash#has_key? and Hash#has_value?).

  • Some APIs are deprecated with Kernel#warn (like Digest::Digest).
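A Kernel#warn-based deprecation, like the one mentioned in the last bullet, might look like the following sketch. The Greeter module and its methods are hypothetical. The example assumes warnings have not been disabled (e.g. with -W0), since Kernel#warn prints nothing in that case.

```ruby
require 'stringio'

# Hypothetical module with an old method name kept around for compatibility.
module Greeter
  def self.greet(name)
    "Hello, #{name}!"
  end

  # Deprecated name: warns on standard error, then delegates to the new one.
  def self.say_hello(name)
    warn 'Greeter.say_hello is deprecated; use Greeter.greet'
    greet(name)
  end
end

# Temporarily swap $stderr to capture what the deprecated call prints.
original_stderr = $stderr
$stderr = StringIO.new
begin
  result = Greeter.say_hello('Matz')
  message = $stderr.string
ensure
  $stderr = original_stderr
end

puts result   # Hello, Matz!
puts message  # Greeter.say_hello is deprecated; use Greeter.greet
```

Unlike rb_warning, nothing here depends on -w, which is part of why such deprecations are easy to miss when grepping the C sources.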

All of the above makes it fairly hard to compile a precise list of deprecations, so we’ll settle for a rough cut here. Let’s see what we can do…

Grepping in Ruby 2.1’s code base reveals the following:

dir.c
2174:    rb_warning("Dir.exists? is a deprecated name, use Dir.exist? instead");

enumerator.c
355:    rb_warn("Enumerator.new without a block is deprecated; use Object#to_enum");

ext/dbm/dbm.c
338:    rb_warn("DBM#index is deprecated; use DBM#key");

ext/gdbm/gdbm.c
453:    rb_warn("GDBM#index is deprecated; use GDBM#key");

ext/openssl/ossl_cipher.c
217:    rb_warn("arguments for %s#encrypt and %s#decrypt were deprecated; "

ext/sdbm/init.c
331:    rb_warn("SDBM#index is deprecated; use SDBM#key");

ext/stringio/stringio.c
656:    rb_warn("StringIO#bytes is deprecated; use #each_byte instead");
876:    rb_warn("StringIO#chars is deprecated; use #each_char instead");
920:    rb_warn("StringIO#codepoints is deprecated; use #each_codepoint instead");
1124:    rb_warn("StringIO#lines is deprecated; use #each_line instead");

ext/zlib/zlib.c
3892:    rb_warn("Zlib::GzipReader#bytes is deprecated; use #each_byte instead");
4174:    rb_warn("Zlib::GzipReader#lines is deprecated; use #each_line instead");

file.c
1413:    rb_warning("%sexists? is a deprecated name, use %sexist? instead", s, s);

hash.c
529:            rb_warn("ignoring wrong elements is deprecated, remove them explicitly");
934:    rb_warn("Hash#index is deprecated; use Hash#key");
3470:    rb_warn("ENV.index is deprecated; use ENV.key");

io.c
3385:    rb_warn("IO#lines is deprecated; use #each_line instead");
3436:    rb_warn("IO#bytes is deprecated; use #each_byte instead");
3590:    rb_warn("IO#chars is deprecated; use #each_char instead");
3697:    rb_warn("IO#codepoints is deprecated; use #each_codepoint instead");
11196:    rb_warn("ARGF#lines is deprecated; use #each_line instead");
11243:    rb_warn("ARGF#bytes is deprecated; use #each_byte instead");
11282:    rb_warn("ARGF#chars is deprecated; use #each_char instead");
11321:    rb_warn("ARGF#codepoints is deprecated; use #each_codepoint instead");

object.c
991:    rb_warning("untrusted? is deprecated and its behavior is same as tainted?");
1005:    rb_warning("untrust is deprecated and its behavior is same as taint");
1020:    rb_warning("trust is deprecated and its behavior is same as untaint");

proc.c
663:    rb_warn("rb_f_lambda() is deprecated; use rb_block_proc() instead");

string.c
6407:       rb_warning("passing a block to String#lines is deprecated");
6576:       rb_warning("passing a block to String#bytes is deprecated");
6665:       rb_warning("passing a block to String#chars is deprecated");
6769:       rb_warning("passing a block to String#codepoints is deprecated");

vm_method.c
54:    rb_warning("rb_clear_cache() is deprecated.");
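The exact search command isn’t shown, but a rough Ruby equivalent could look like the sketch below. The find_deprecations helper, its regular expression and the ./ruby checkout path are all assumptions made for illustration, not the command that produced the listing above.

```ruby
# Matches rb_warn(...) and rb_warning(...) calls that mention a deprecation.
DEPRECATION_CALL = /rb_warn(?:ing)?\(.*deprecated/

# Scan the C sources under +root+ and return a hash of
# relative file path => ["line_number: source line", ...].
def find_deprecations(root)
  results = Hash.new { |hash, key| hash[key] = [] }

  Dir.glob(File.join(root, '**', '*.c')).sort.each do |path|
    relative = path.delete_prefix("#{root}/")
    File.foreach(path).with_index(1) do |line, number|
      # scrub guards against the odd invalid byte sequence in a source file
      next unless line.scrub.match?(DEPRECATION_CALL)

      results[relative] << "#{number}: #{line.strip}"
    end
  end

  results
end

# Usage, assuming a Ruby source checkout in ./ruby (hypothetical path):
#   find_deprecations('ruby').each do |file, hits|
#     puts file, hits, ''
#   end
```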

Below is a cleaned-up list of the output shown above. I’ve removed everything that’s unlikely to be of general interest.

  • Dir.exists? is a deprecated name, use Dir.exist? instead
  • Enumerator.new without a block is deprecated; use Object#to_enum
  • StringIO#bytes is deprecated; use StringIO#each_byte instead
  • StringIO#chars is deprecated; use StringIO#each_char instead
  • StringIO#codepoints is deprecated; use StringIO#each_codepoint instead
  • StringIO#lines is deprecated; use StringIO#each_line instead
  • File.exists? is a deprecated name, use File.exist? instead
  • Hash#index is deprecated; use Hash#key
  • ENV.index is deprecated; use ENV.key
  • IO#lines is deprecated; use IO#each_line instead
  • IO#bytes is deprecated; use IO#each_byte instead
  • IO#chars is deprecated; use IO#each_char instead
  • IO#codepoints is deprecated; use IO#each_codepoint instead
  • ARGF#lines is deprecated; use ARGF#each_line instead
  • ARGF#bytes is deprecated; use ARGF#each_byte instead
  • ARGF#chars is deprecated; use ARGF#each_char instead
  • ARGF#codepoints is deprecated; use ARGF#each_codepoint instead
  • Object#untrusted? is deprecated and its behavior is same as Object#tainted?
  • Object#untrust is deprecated and its behavior is same as Object#taint
  • Object#trust is deprecated and its behavior is same as Object#untaint
  • passing a block to String#lines is deprecated
  • passing a block to String#bytes is deprecated
  • passing a block to String#chars is deprecated
  • passing a block to String#codepoints is deprecated
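As a short illustration of what the list means in practice, here is a sketch that uses only the suggested replacements. The sample data is made up.

```ruby
require 'stringio'

# Dir.exist? instead of Dir.exists? (File.exist? works the same way)
dir_found = Dir.exist?(Dir.pwd)    # => true

# Hash#key instead of Hash#index
grades = { alice: 90, bob: 85 }
student = grades.key(85)           # => :bob

# StringIO#each_line instead of StringIO#lines
io = StringIO.new("one\ntwo\n")
io_lines = io.each_line.to_a       # => ["one\n", "two\n"]

# String#each_line instead of passing a block to String#lines
lengths = []
"one\ntwo\n".each_line { |line| lengths << line.chomp.size }
lengths                            # => [3, 3]

# Object#to_enum instead of Enumerator.new without a block
pairs = [1, 2, 3].to_enum(:each_slice, 2).to_a  # => [[1, 2], [3]]
```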

Unfortunately there’s no way to know in which version of Ruby something got deprecated. Obviously most of the things on the list were deprecated before Ruby 2.1. Ideally in the future we’ll get a better deprecation mechanism that actually keeps track of such data.

Hopefully some of you will find this information useful!

We’re planning to get some deprecation tracking in RuboCop, but due to Ruby’s dynamic nature implementing such a feature reliably in a static code analyzer is an impossible task.
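To illustrate the difficulty with a sketch: looking only at the source of position_of below, there is no way to tell which index method the call will resolve to. Both position_of and the Playlist class are hypothetical.

```ruby
# A method whose receiver type is only known at run time.
def position_of(collection, item)
  collection.index(item)
end

# Hypothetical class that happens to define its own, perfectly fine, #index.
class Playlist
  def initialize(*songs)
    @songs = songs
  end

  def index(song)
    @songs.index(song)
  end
end

in_array = position_of(%w[a b c], 'b')               # => 1 (Array#index, not deprecated)
in_playlist = position_of(Playlist.new('x', 'y'), 'y') # => 1 (Playlist#index, not deprecated)
# position_of({ a: 1 }, 1) would have hit the deprecated Hash#index on Ruby 2.1.
```

A static analyzer that flagged every index call would drown the one deprecated case in false positives.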