Hacker News
Streem – a new programming language from Matz (github.com/matz)
547 points by tree_of_item on Dec 11, 2014 | hide | past | favorite | 189 comments



I am glad that someone is working on a new stream processing language; it is a very interesting paradigm. However, I hope that they provide some very robust tools for controlling input splitting, as I have spent too much time fighting with awk and wishing it were more flexible (it is frustrating to always have exactly two levels of splitting, with matchers only on the first and only inverse splitting on the second).

As it is, you would have to put in filters to resplit input into lines, and that is very messy for something that you will need or want to do very often.

For example, if you wanted to parse by character, it would be wonderful to be able to do the following:

  STDIN | /./{|c|
    # stuff
  }
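For concreteness, here is roughly what that per-character split amounts to in Ruby (the helper name is invented):

```ruby
require "stringio"

# Treat the input as a stream of regexp matches rather than lines,
# which is roughly what the per-character form above asks for.
def each_match(io, pattern, &block)
  io.read.scan(pattern, &block)
end

each_match(StringIO.new("hi!"), /./) { |c| print c }   # prints "hi!"
```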
Even better would be if you took it a step further and offered something like regex pattern matching for the block input. e.g.

  STDIN | /\w+/{|word|
    /house/ {
      # when word is house
    }
    /car/ {
      # when word is car
    }
    {
      # default case
    }
  }
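For comparison, a sketch of the same word dispatch in plain Ruby, with case/when standing in for the inner matchers (helper name invented):

```ruby
require "stringio"

# Scan the stream into words, then dispatch on each word;
# case/when with regexps plays the role of the inner matcher blocks.
def classify(io)
  io.read.scan(/\w+/).map do |word|
    case word
    when /house/ then "when word is house"
    when /car/   then "when word is car"
    else              "default case"
    end
  end
end
```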


Awesome! Let's call it sed!

Reference: https://www.gnu.org/software/sed/manual/sed.html#Examples

Sorry, I couldn't resist that one. :) Really, check the examples. Sed is a most powerful tool and can do astonishing work.


That manual is possibly the single most useful technical document that I have ever read. It has enabled me to write extremely powerful programs that no one I know can understand.

It would however be nice to have a tool with similar power but simpler, more comprehensible syntax. One of the other commenters linked to an interesting document on sam, which has a better control flow but equally arcane syntax.


I think I could argue that Perl is that tool. I'm not looking forward to people who have little to no experience with it chiming in about the syntax, though.


I fell in love with Perl 3 over twenty years ago and loved how it took the best parts of both awk and sed and added extra features while making the overall syntax more consistent. Ironically, the consistency of Perl's syntax in those days (vs having to remember awk, sed, and other somewhat overlapping tools, each with its own awkward syntax) made it an amazingly convenient super-tool for stream processing. Perl 4 came along almost immediately, and I couldn't believe how powerful those little "Perl one-liners" could be.

With Perl 5, Perl essentially repositioned itself as a full-blown, general-purpose programming language, with all the power that entailed, but still stuck with a syntax based on essentially trying to be a more powerful and consistent superset of sed and awk.

These days, I think we could do a lot better with a stream processor with the stream-processing focus and one-liner power of Perl 4 but without the needless (these days) constraint of retaining the syntactic hash of ancient unix utilities.


Perl indeed. I know there are a lot of modern languages out there, but when it comes to parsing files, I think Perl suits me best in terms of the balance between code readability and speed.


> It has enabled me to write extremely powerful programs that no one I know can understand.

I'd argue that makes it one of the least useful documents you've ever read :)


> It has enabled me to write extremely powerful programs that no one I know can understand.

A job for life! :-)


I think that stream-oriented languages are doomed to have an arcanic syntax.

Streams are a non-trivial construction, after all.


I don't think I believe you. Look at the car and house example above. Is that arcane?


Well, sorry, but sed _is_ cryptic. Let me quote an example for you (squeezing blank lines):

> leaves a blank line at the beginning and end if there are some already.

     #!/usr/bin/sed -f

     # on empty lines, join with next 
     # Note there is a star in the regexp 
     :x 
     /^\n*$/ { 
       N
       bx
     }
     
     # now, squeeze all '\n', this can be also done by:
     # s/^\(\n\)*/\1/
     s/\n*/\
     /
As soon as you begin to use sed registers, the code becomes arcanic.
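For contrast, the same blank-line squeeze written longhand in Ruby; this is a sketch of the behavior, not a drop-in sed replacement:

```ruby
require "stringio"

# Collapse each run of blank lines down to a single blank line,
# keeping a leading/trailing blank line if one was already there.
def squeeze_blank_lines(io)
  out = []
  io.each_line do |line|
    blank = line.strip.empty?
    next if blank && out.last == "\n"
    out << (blank ? "\n" : line)
  end
  out.join
end
```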


> arcanic

Arcane. Some would call it archaic too :)


> I think that stream-oriented languages are doomed to have an arcanic syntax.

> Streams are a non-trivial construction, after all.

...

> Well, sorry but sed _is_ cryptic. Let me quote an exemple for you (squeezing blank lines):

I agree with you that sed is cryptic, but I don't think that necessarily means that stream processing languages are doomed to have an archaic syntax. I'd also agree with marvy that the car and house example is very clear as to its intent:

  STDIN | /\w+/{|word|
    /house/ {
      # when word is house
    }
    /car/ {
      # when word is car
    }
    {
      # default case
    }
  }
I'd say that's very straightforward and not arcane at all. Just because sed is arcane (it could be argued) doesn't make that a precondition for every stream language.

EDITS: formatting.


Everything you can do in sed you can do in Perl; sed-to-perl converters have shipped with Perl since the ur-times.


Yeah yeah, of course. The hacked-up syntax shown by the GP really made me think of sed. :)


This is how you would write it in Factor:

    "~/yourfile.txt" utf8 file-contents 
    "\\w+" findall [ first second ] map [ 
        { 
            { "house" [ "house stuff" print ] } 
            { "car" [ "car" print ] } 
            [ "default stuff: %u\n" printf ] 
        } case 
    ] each
Tacit programming is basically the same thing as stream programming.


> However I hope that they provide some very robust tools for controlling input splitting

Yes, splitting and combining. Looking at the code example, I feel like so much opportunity is being squandered; something like Rust's matching constructs would feel so much better. Yours is definitely in that vein; here's something a little closer:

  STDIN | /\w+/{|word|
    /house/   => "foo"
    /car/     => "bar"
    "literal" => "baz"
    _         => "dib"
    # (where _ is a magical symbol for the default case... could be anything)
  }
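One way the arrow table could behave, sketched in Ruby: rules are tried in order, first match wins, and :_ stands in for the magical default (all names invented):

```ruby
# Ordered rule table: Regexp#=== matches against the word,
# String#=== compares for equality, and :_ is the catch-all.
RULES = [
  [/house/,   "foo"],
  [/car/,     "bar"],
  ["literal", "baz"],
  [:_,        "dib"],
]

def rewrite(word)
  RULES.each { |pat, out| return out if pat == :_ || pat === word }
end
```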


Streaming, pattern matching, and regex combined in one sounds pretty damn cool.

Might also be nice to borrow some of Scala's features for terser code than Ruby, like

    ary | { _ + 5 }
instead of

    ary | { |el| el + 5 }


I prefer

    {|el| el + 5}
to

    { _ + 5 }
The latter is too implicit for my taste; I also imagine it could become confusing in a more complex context.


Personally I like it and find it clear.

Compare:

    ary.map(_ * 2)

    ary.map(x => x * 2)
"_" is perhaps an ugly choice of character (frankly I'm not sure why Scala is so obsessed with it, since it's used for so many features), but I think the semantics are sensible.

Of course, in a more functional language you could perhaps just write

    ary.map(* 2)
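Ruby can get surprisingly close to that operator section by lifting the method bound to a receiver:

```ruby
# 2.method(:*) is the * method bound to 2; & turns it into a block,
# which is about as close as Ruby gets to ary.map(* 2).
[1, 2, 3].map(&2.method(:*))   # => [2, 4, 6]
```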


> I'm not sure why Scala is so obsessed with it, since it's used for so many features

You might check out http://stackoverflow.com/a/8001065/3614122. The interesting thing (to me) is that the majority of those are actually the same feature (lifting functions). It's just such a powerful/flexible one that it's often misinterpreted as an entirely new thing in different contexts.

  ary.map(_ * 2)
Is no different than:

  val f = (_:Int) * 2

  ary.map(f)
Or:

  val f: Int => Int = _ * 2

  ary.map(f)
You're just using type-inference in conjunction with lifting. I wrote a blog post on it that's maybe not complete garbage. ;-) http://www.ssmoot.me/scala-s-magical-placeholder

Some of the other examples of the underscore are pattern-matches, which can be used on the LHS of assignment, similar to Erlang. Which comes in handy writing unit-tests. i.e.: If I expect something to return Some[User], and I'm writing a GET->UPDATE integration test, then I'd probably do something like:

  "Update should not blow up" in async {
    val Some(user) = await(db.get(userId))
    val update = user.copy(name = "bob")
    val Success(result) = await(db.put(update))
  }
Extractors/Pattern-Matching is way way up there for me. Much more significant than for-comprehensions. Though you could write the same code like:

  "Update should not blow up with a for-comprehension" in {
  
    val test = for {
      Some(user) <- db.get(userId)
      update = user.copy(name = "bob")
      Success(result) <- db.put(update)
    } yield result

  whenReady(test) { "whatever" }
  }


Your last example code is missing a closing-brace. I know no Scala, though, so I'm actually not sure where it should go.


Thanks. Missed it in the for-comprehension just before the yield.


What about a middle ground with Swift's syntax?

    Arr.map { $0 * 2 }


That was going to be my suggestion too. With the added benefit that it supports multiple arguments.


Yeah, that'd be pretty cool.


Probably because underscore is often used by convention in forms and other typography to indicate "something goes here which is to be supplied." In that respect it's no different than Perl's $_ (actually it's the exact same thing, as the $ in Perl just denotes a variable, so they are both the variable _), or Perl 6's "whatever", which is written as "*".


This is not a confusing context, though. This would be a confusing context:

    ary.map(f(_ * 2))

    ==> ary.map(x => f(x * 2)) ?
    ==> ary.map(f(x => x * 2)) ?
How does Scala interpret that one? I have no idea.


It doesn't. It's invalid code in most (all?) cases.

  scala> val f: Int => Int = _ - 1
  f: Int => Int = <function1>

  scala> val ary = Array(1,2,3)
  ary: Array[Int] = Array(1, 2, 3)

  scala> ary.map(f(_ * 2))
  <console>:10: error: missing parameter type for expanded function ((x$1) => x$1.$times(2))
              ary.map(f(_ * 2))
                        ^
You can't think of "_" like a placeholder. It's not. It's to lift an argument of a function.

So simplify it: map in this context wants a Function[Int, Int] right? So f(_ * 2) must return a Function[Int,Int]. But it doesn't probably. _ is lifting some argument out of whatever f is. If you assume _ is an Int, does f take a Function[Int,Int]? No. It takes an Int. So there's no way to parse this that makes sense. It's not just "I have a stack of vars, pull one off the stack and bind it every time I write an underscore, reading left to right". That would be some AST generative grammar hack. That's not what the underscore is. It's simpler and more consistent than that.

What you're probably looking for instead is Function Composition. So something like:

  scala> ary.map(f compose(_ * 2))
  res10: Array[Int] = Array(1, 3, 5)
So why does that work? Because we were able to compose f() into a larger function that satisfies the signature that map[T](Int => T) requires. A lot like a Stream conceptually.
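For readers without a Scala REPL handy, the same composition can be mirrored with Ruby's Proc#<< (available since Ruby 2.6):

```ruby
# f << g builds x -> f(g(x)), just like Scala's f compose (_ * 2).
f = ->(x) { x - 1 }
g = ->(x) { x * 2 }

(f << g).call(1)          # f(g(1)) => 1
[1, 2, 3].map(&(f << g))  # => [1, 3, 5], matching the Scala result
```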

(Is it correct to say in this context f() is a Monad? I'm not sure, I need to sit down and grok the category stuff sometime...)

Or you can write it the long way (calling f inside a new function). But instead of defining "steps" you'd be creating a new imperative function and driving the stack.


That's interesting, but you did not answer my question. I wanted to know how Scala determines where to insert =>. In my example, Scala interprets "ary.map(f(_ * 2))" as "ary.map(f(x => x * 2))" (both of these expressions yield the same error). If it interpreted it the other way, then the expression would work (using your definition of f). Perhaps a better example to use would have been:

    Array((_:Int) * 2)

    ==> (x:Int) => Array(x * 2) ? (correctly typed)
    ==> Array((x:Int) => x * 2) ? (also correctly typed)
As it turns out, Scala goes for the latter. After playing with the feature a bit I'd say the rules are engineered to fit particular use cases, but they can be a bit finicky if you venture outside of them. For instance, "1 + _ * 2" will work, but "(1 + _) * 2" won't. "f(_)(x)" does not mean the same thing as "(f(_))(x)".

I mean, I wouldn't say the feature is bad, but it definitely has a DWIM vibe to it that would make me classify it as a hack.


> In my example, Scala interprets "ary.map(f(_ * 2))" as "ary.map(f(x => x * 2))"

I think this is where I'm not communicating what I'm trying to say very well. What I'm trying to say is that that statement is false.

The underscore isn't a placeholder saying "inject a Function here". You probably didn't mean that exactly of course, but it's a programming language; it pays to be a bit pedantic I think.

It's easier to understand if you work backwards maybe.

Is this valid?

  (_:Int) * 2
Of course. It's a Function[Int,Int]. The important part is the asterisk method that's being lifted to a Function.

So now you take that perfectly valid piece of code lifting a method to a function and you pass it to Array.apply:

  Array.apply((_:Int) * 2)
Why should your code start behaving differently? That wouldn't be consistent at all. You had a Function[Int,Int] before, but now that you've passed that function value to Array.apply, instead of getting an Array[Function[Int,Int]] as you would in every single other case, you get a Function[Int,Array[Int]]?

What sense does that make? That seems like straight up voodoo.

There's some syntax supporting this feature. But not so much as you think. It's just the "lift". Type-inferencing lets you get away with what looks like a little more sometimes.

I dunno. Maybe that's helpful. Maybe not.


> The underscore isn't a placeholder saying "inject a Function here".

It is. Look at the error message:

    error: missing parameter type for expanded function ((x$1) => x$1.$times(2))
It did inject a function, it's right there, in plain text. Then it did type inference. I mean, what else is "lifting" the asterisk to a Function supposed to mean, if not injecting a function around a placeholder? And what about "1 + _ * 2"? What is it lifting? The asterisk? No. It is lifting more than that.

> You had a Function[Int,Int] before, but now that you've passed that function value to Array.apply instead of getting an Array[Function[Int,Int]] as you would in every single other case, you get a Function[Int,Array[Int]]?

That's beside the point. The question is, when the parser sees "_", how much of the context does it grab along with it? In other words, I know it's lifting stuff; what I want to know is how much it lifts. Here's another example:

    f(_, 2)(3)

    ==> x => (f(x, 2)(3)) ?
    ==> (x => f(x, 2))(3) ?
Scala does the former. That's a legitimate choice given common use cases, but the latter is simpler, preserves the invariant that "a(b) <=> (a)(b)" and has use cases as well, e.g. to make an expression like "f(super_long_expression, 2)" more readable.

> But not so much as you think. It's just the "lift". Type-inferencing lets you get away with what looks like a little more sometimes.

I still don't see what type inference has to do with this. The error message makes it clear that the lift is done before type inference kicks in. In a dynamic language, you would stop at the lift, but it would otherwise work just the same.


> It did inject a function, it's right there, in plain text. Then it did type inference.

Yes, the inferencer/type-resolver runs after the parser (there's a recent thread on the ML about a Parboiled parser you might find interesting BTW). My point was this is parsed. It's an AST at that point, it just hasn't resolved the types yet. It's not a source preprocessor. To me that implies limitations, but also an expectation of consistency.

> ... what about "1 + _ * 2"? What is it lifting? The asterisk? No. It is lifting more than that.

Yes! Paste your code into the REPL and see what you get:

  scala> 1 + _ * 2
  <console>:8: error: missing parameter type for expanded function ((x$1) => 1.$plus(x$1.$times(2)))
              1 + _ * 2
This is a real hairy example IMO since it's conflating the mechanics with optional parentheses and dot-less method calls, but we can break it down just the same. Since Int.+ takes a single argument, it must parse as:

  1.+(_ * 2)
NOT

  (1 + _) * 2
The first example is trying to lift $times from something. The second is trying to lift the instance-method that's already bound to 1:Int to a Function0[Int].

They're doing exactly the opposite thing right? It's like assuming that the LHS and RHS of an assignment are interchangeable. They're not. One is a label and one is a value.

> how much of the context does it grab along with it?

None. In this position, it's the beginning of an expression. It doesn't escape the scope it's defined in. Just like in the 1.+(_ * 2) example, what comes before it doesn't matter. It's the beginning of an expression.

> I still don't see what type inference has to do with this.

I probably just complicated things bringing it up. ary.map(_ * 2) vs ary.map((_:Int) * 2). I feel like maybe it's not understood that those do the same thing. The only difference is the type is inferred from the signature of map[T] (simplified minus CBF) on the first.

I'm not trying to debate that sometimes it's frustrating that I can't just:

  documents foreach(couchdbActor ! Queue(json.merge(_)))
Intuitively, it seems like it's possible it could work. And it might if the underscore was accomplished through code-rewriting of some sort right? But it's not. So what I'm doing is:

  documents foreach(couchdbActor.tell(Queue(JsObject => JsObject))
I'm not satisfying the signature of foreach because tell() isn't returning a Function[JsObject,Nothing], and I'm passing a Function[JsObject,JsObject] to Queue.apply (case class Queue(payload:JsObject)). Neither of those things makes sense. So what I do instead:

  documents foreach((couchdbActor ! _) compose Queue compose json.merge)
Breaking that down, I've got:

  (couchdbActor ! _) // This just lifts the tell method, which gives me a Function[Any,Unit].

  Queue.apply // Function[JsObject,Queue]: I can compose that with my lifted tell to get a Function[JsObject,Unit]

  json.merge // Function[JsObject,JsObject]: Doesn't change the signature: Function[JsObject,Unit]
Now foreach() gets passed a Function[JsObject,Unit]. Which is exactly what it's expecting.

When you break it down, you have to keep a "stack" in your head. Which can get hard sometimes (at least for me), but the underlying consistency about what's actually happening makes it simple to do. It's like a basic addition problem:

  27 + 34 + 192 + 11
I expect for most people that takes some deliberate thought, breaking it into smaller operations, carrying the result in your head as you go. But it's still simple, because it's consistent.

> In a dynamic language, you would stop at the lift, but it would otherwise work just the same.

I was going to attempt to write a Ruby analogue, but I can't actually figure out a clean way to do that.

I dunno. I feel like you probably understand all this just fine. You just don't like it. You'd prefer a sort of source-rewriting approach? So maybe Scala just isn't the language for you. I feel like once you understand the mechanics of the underscore, it's pretty straightforward. Similar to beginners getting tripped up on for-comprehensions:

  val o = Option(1)
  val f = Future.successful(List(2,3,4))

  for {
    n <- o
    l <- f
    x <- l
  } yield n + x
That trips up everybody at some point right? If you understand that it's just map, flatten and filter, it's clear why that doesn't work though. How do you say: Option(Future(2)).flatten? What would that do? Can't work. It may be initially frustrating for new users, but it's consistent. If you study the mechanics, then it becomes second nature. The mystery disappears. That's really all I was trying to do here. Hopefully someone finds it useful.

Side note: This is a big part of why I find for-comprehensions mostly useless. They do nothing you can't already do, and they introduce what can look like magic. They're pure sugar, except more often than not, they're also longer to write. Plus they're mostly useless outside of testing. How often do you want a Failure to just throw? Or a Play Action to return a None? More often than not, you're gonna want to fold(), getOrElse(), match { case None => NotFound; case Some(foo) => ... Ok(finalResult) } etc. But I digress. :-)


> This is a real hairy example IMO since it's conflating the mechanics with optional parentheses and dot-less method calls, but we can break it down just the same. Since Int.+ takes a single argument, it must parse as:

> 1.+(_ * 2)

No, my point is that this is not how it parses. Look:

    ary.map(1 + (_:Int) * 2)
    res3: Array[Int] = Array(3, 5, 7, 9, 11)
    ary.map(1.+((_:Int) * 2))
    error: overloaded method value + with alternatives:
      (Double)Double
      ...
In the first case, the parser lifts _ over $times AND over $plus. The function is being lifted over both of them. The optional parentheses break the feature if you insert them: "1.+(_ * 2)" tries to add one to a function, which is not a valid operation.

>> how much of the context does it grab along with it?

> None. In this position, it's the beginning of an expression. It doesn't escape the scope it's defined in. Just like in the 1.+(_ * 2) example, what comes before it doesn't matter. It's the beginning of an expression.

If that was true, then "1 + _ * 2" would be invalid, and "f(_)(1)" would be equivalent to "f(1)". But the former is valid and the latter doesn't work like that.

> I was going to attempt to write a Ruby analogue, but I can't actually figure out a clean way to do that.

There is a limited Python analogue:

https://github.com/kachayev/fn.py#scala-style-lambdas-defini...

That only works insofar as the _ placeholder object can seize control of the expression, so "f(_, 2)" won't work unless f explicitly handles _. Implementing the full feature would require language or macro support, but my point was more that a new dynamic language could trivially support it.
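To illustrate that last point, here is a toy Ruby placeholder in the fn.py spirit, built on method_missing; the class and constant names are invented, and like fn.py it only works when the placeholder controls the expression:

```ruby
# Each operator call on the placeholder returns a one-argument lambda,
# so It * 2 is a function, not a value.
class Placeholder
  def method_missing(op, *args)
    ->(x) { x.public_send(op, *args) }
  end
end

It = Placeholder.new
(It * 2).call(21)          # => 42
[1, 2, 3].map(&(It * 2))   # => [2, 4, 6]
```

As with fn.py, "f(It, 2)" would still need explicit support from f, since the placeholder never sees that expression.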


> Of course, in a more functional language you could perhaps just write "ary.map( * 2)"

Wait, isn't that already possible in Scala?

    ary.map(2*)


For only a few characters more, the second form is much clearer to many more programmers. I think it's worth it.


Personally I would go for something more like this:

    STDIN | /\w+/ | {|each match|
      /house/   => "foo"
      /car/     => "bar"
      "literal" => "baz"
      _         => "dib"
    }
Where "each x" in a parameter list wraps the body in a map and uses x as the loop variable, and "match" in a parameter list indicates that this is the parameter the body pattern-matches on (no need to name it).

I think this is a more flexible solution, and modifiers other than each could be introduced. For instance, sum of squares could be:

    seq(100) | {|each x| x * x} | {|reduce (x = 0, y)| x + y}
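Assuming seq(100) means the integers 1 through 100, that pipeline reads in plain Ruby Enumerable terms as:

```ruby
# map plays the role of the `each` stage; reduce (a.k.a. inject)
# plays the `reduce` stage, with 0 as the seed value.
sum = (1..100).map { |x| x * x }.reduce(0) { |acc, y| acc + y }
# sum == 338350
```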


Sounds like a job for structural regex: http://doc.cat-v.org/bell_labs/structural_regexps/se.pdf


Yes, this paper is interesting.

It shows well how a slight shift in how the problem is expressed can lead to a more powerful, simpler, and more general solution. In this case, the shift is to design stream processors (like `sed` and `awk`) to work on streams of text chunks defined by regular expressions rather than on streams of lines matching regular expressions.

Regarding this concern, it's hard to say what Matz's plan for Streem is from a single code example. But the fact that the sample code processes a stream of integers gives hope for a tool with more flexibility than `sed` or `awk` in specifying what flows along the streams.


Take a look at TXR: http://www.nongnu.org/txr/

The link was posted on HN some time ago and was generally well received. I haven't had the time to use it much yet, but it looks very nice. Well, unless you hate Lisp.


I'd like to recommend a Python command-line utility called pyp. It allows you to do Python string manipulation on text streams using the standard pipe operator.

https://code.google.com/p/pyp/


Yes, pyp is interesting. So are some other roughly similar Python tools for doing pipe-like stuff. I blogged about some of them a while ago, including pyp, osh, and pipe (not the Unix pipe system call, but a Python module):

Some ways of doing UNIX-style pipes in Python:

http://jugad2.blogspot.in/2011/09/some-ways-of-doing-unix-st...

And that later inspired me to create an experimental tool called pipe_controller:

Swapping pipe components at runtime with pipe_controller:

http://jugad2.blogspot.in/2012/10/swapping-pipe-components-a...

The above post links (recursively) to a few others, also by me, describing a couple of other ways of using pipe_controller, including this one:

Using PipeController to run a pipe incrementally:

http://jugad2.blogspot.in/2012/09/using-pipecontroller-to-ru...

and also includes the link for it on my Bitbucket account.

There's also plumbum, another such Python module.


I found it cumbersome to do branching in most of the existing Python stream tools. I made an alternative that allows you to use curly brackets for indentation:

https://github.com/ircflagship2/pype


This inspired me to build a similar tool which runs javascript instead.

https://github.com/aynik/sx


> "a new stream processing language"... which are the old ones?

I'm playing with the idea of building a language where the functions are unix-like, with STDIN, OUT & ERR. So, instead of raising an exception, a function puts data in ERR... and that makes it easy to compose them.
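A sketch of that idea in Ruby: a stage that never raises, routing failures to its own err channel so stages stay composable (the function shape here is invented):

```ruby
# Returns [out, err] instead of raising: bad inputs flow to err,
# good results flow to out, and downstream stages can consume either.
def safe_div(pairs)
  out, err = [], []
  pairs.each do |(a, b)|
    b.zero? ? err << [a, b] : out << a / b
  end
  [out, err]
end

out, err = safe_div([[6, 3], [1, 0]])
# out == [2], err == [[1, 0]]
```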


There don't seem to be that many that work on text streams, which is what I was referring to. The ones I am aware of are awk, shell scripting languages (bash, zsh, ksh, etc.), and the ed-like DSLs (sed, ed, sam, etc.). I think there are others that are not as widely used.

If you google "stream processing language" you will also get a whole bunch that are not text-specific. They are definitely worth looking at if you are interested in building any kind of stream-based language.



If you ever meet Matz, talk to him about programming languages. While he gets some flak for all the problems of his original hobby project (Ruby), he obviously loves programming languages and gives things more thought than people give him credit for. I had the chance to talk to him while I was still a student and full of ideas about how the language could be made "better", and he shot them all down. For good reasons, as I know nowadays. So I always love seeing him building languages.

I shared a small story about him and languages quite a while ago, I guess it fits here as well: https://news.ycombinator.com/item?id=6562979


Does he really get a "lot of flak" for Ruby? While I know not everyone loves Ruby, it seems crazy to me that people would denigrate Matz on a personal level...to me (admittedly, a novice in designing languages), Ruby always seemed well-thought out...that is, the trade-offs do not seem out of line given the philosophical benefits, and not everyone can make claim to turning a personal project into a worldwide language.

Also, he seems like a nice guy, not the type to be drawn into the kind of flareups in which he would draw flak.


As someone who loves Ruby dearly, there are a number of severe flaws that are hampering Ruby now.

The thing is they are hampering Ruby now because people want to use Ruby for all kinds of things he probably never intended it for.

E.g. I'm on a crazy multi-year journey towards an ahead-of-time compiler for Ruby. Ruby is not designed for that. It has dozens of issues that make it incredibly hard to do efficiently compared to many other languages.

At the same time, MRI was incredibly inefficient in earlier incarnations because it was simple. It interpreted the syntax tree. Easy to implement. A nightmare to make fast.

Ruby's grammar is also a massive pain for people who want to implement Ruby. It's mostly great for developers, until you run into the hairy corner cases that often result from trying to be incredibly clever so that things flow naturally when writing Ruby, which has led to a lot of byzantine rules.

I wouldn't give Matz flak over it, simply because some of these issues are part of what makes Ruby so pleasant to use, and others are simply artefacts of him meeting his own needs while implementing Ruby, rather than designing it to meet some ideal that wasn't necessarily relevant to him at the time.

But I'm not surprised (a bit sad, but not surprised) if someone makes it personal.


I don't think many people have a problem with Ruby the language so much as with the culture around Rails.


What would be the point of blaming Matz for that?


Well, quite.


Anyone who does anything significant will inevitably get flak from somewhere.


See the number of people in this thread complaining about the missing spec for Ruby, which is his doing. At some point ~50% of the comments here were downvoted.

I should clarify that I meant Ruby the implementation, not Ruby the language.

Edit: I changed the word to "some"; maybe that was a bit of hyperbole. I still see a fair amount of unreflective bashing, though.


Matz worked on a spec for the Ruby language, which is now an ISO standard: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_...


Thanks, Yehuda. I am very much aware of that, but that fact doesn't stop people from complaining.

You of all people should be aware that the spec was extracted from the implementation long after the language got popular, which is _exactly_ what people are complaining about.


I'd be careful about letting people shut down my ideas up front in casual conversation.

Sure, you may end up discarding your ideas yourself, but if you instead sought to implement them, you could learn something along the way, and maybe turn them into something that actually makes sense. That would not happen if you just followed advice from some authority and simply decided not to experiment with your ideas any further. So I would not discard advice from knowledgeable people, but I would not take it as the final word either.


*Yukihiro Matsumoto, creator of Ruby

I'm not sure everyone is familiar enough with Ruby to know who Matz is.


If they don't know who Matz is, are they likely to know who Yukihiro Matsumoto is?

EDIT: Interesting fact: I believe this is now my most downvoted comment in five years on HN, at effectively -7. Never would have guessed. I'm not exactly sure what it says, but I thought it was an interesting data point.


No, but the sentence as a whole does explain who he is:

"Yukihiro Matsumoto, creator of Ruby"


Probably not, but the epithet `creator of Ruby` gives more context.


I, for one, remember the full name but not "Matz"


That's where I am, I didn't realize it until I got down to the copyright.


same here


I agree with you, chc. I've been using Ruby for 3 years and would not have been able to retrieve his full name from memory!


Why do most implementations of FizzBuzz special-case % 15? I haven't ever really understood this. Maybe it's just my math-y background, but it always seemed to me you should just check mod 3 and mod 5 without an else between them, concatenating Fizz and Buzz.

Can anyone else comment on this? Most canonical FizzBuzz programs special-case 15, and I don't get it.
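The concatenation version the parent describes, written out in Ruby for concreteness:

```ruby
# No special case for 15: test mod 3 and mod 5 independently,
# concatenate the words, and fall back to the number itself.
def fizzbuzz(n)
  s = ""
  s << "Fizz" if n % 3 == 0
  s << "Buzz" if n % 5 == 0
  s.empty? ? n.to_s : s
end

(1..100).each { |i| puts fizzbuzz(i) }
```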


Well, it's more direct in that, while it increases the number of branches, it minimizes the number of statements executed on any branch. It's also the solution that maps most directly to the problem statement, and, absent a strong technical reason to do otherwise, a direct mapping from requirements to code is a good thing.


I disagree that it maps most directly to the problem statement. You're performing a common factor computation in your mind, which may be more difficult given numbers other than 3 and 5. In my opinion, pattern matching offers the most direct solution and comes with an abundance of compiler optimizations. Here's an example in Rust...

        for i in range(1i, 101) {  
            match (i % 3, i % 5) {  
                (0, 0) => println!("Fizzbuzz"),  
                (0, _) => println!("Fizz"),  
                (_, 0) => println!("Buzz"),  
                _ => println!("{}", i),  
            }  
        }


> I disagree that it maps most directly to the problem statement. You're performing a common factor computation in your mind, which may be more difficult given numbers other than 3 and 5.

Well, sure, explicitly calling out i % 15 rather than (i % 3) && (i % 5) or the equivalent has that problem.

> In my opinion, pattern matching offers the most direct solution and comes with an abundance of compiler optimizations.

Pattern matching is not available in many languages, but, sure, where it's available, it's a great choice. Note that this still has a distinct branch for the case where both % 3 and % 5 are true, rather than just testing those cases independently and sequentially and concatenating, so I think it falls into the general class of solutions I was describing.


I'll take an extra branch over concatenation and dealing with new line chars any day.


The solution in Haskell is quite clean, I believe.

  fizzBuzz n 
     | n `mod` 15 == 0 = "FizzBuzz"
     | n `mod` 3  == 0 = "Fizz"
     | n `mod` 5  == 0 = "Buzz"
     | otherwise       = show n

  main = mapM_ (putStrLn . fizzBuzz) [1..100]
I agree with you about generalizing pattern matching for less simple cases. Your example brought to mind view patterns, about which Oliver Charles had a nice writeup recently [1]. Nifty little extension.

[1] https://ocharles.org.uk/blog/posts/2014-12-02-view-patterns....


Using F# pattern matching:

    let buzzer number =
       match number with
       | i when i % 3 = 0 && i % 5 = 0 -> "FizzBuzz"
       | i when i % 3 = 0 -> "Fizz"
       | i when i % 5 = 0 -> "Buzz"
       | i -> (sprintf "%i" i)

    for i = 1 to 100 do
        printfn "%s" (buzzer i)


If only rust had (optional only, please) fall through on matches, then you could skip a whole line :D


Just because 15 HAPPENS to be a concatenation of the output of 3 and the output of 5 today doesn't mean it will be tomorrow. If I said "say Fizz for multiples of 3, Buzz for multiples of 5, and 'this is a silly coding problem' for multiples of both 3 and 5", then you'd have to rewrite your code.

Some of us know that clients ALWAYS change their minds and that specs are rarely equivalent to the end result, so we code against future changes that are trivial to account for in advance.


For any implementation you can manufacture an example which will require a person to re-structure their code.

Additionally, using %15 is not DRY. If the spec changes from saying "Fizz" on multiples of 3 to saying "Fizz" on multiples of 4, then you will have to also update 15->20. If you forget to do this, you have a bug.

The correct implementation is dependent on the problem's context, and such context is not available with the FizzBuzz problem.


Part of my point (which I obviously didn't communicate well) is that Fizz concatenated with Buzz is a premature optimization. It's the developer taking advantage of a linguistic coincidence. The instructions are to output 3 different strings based on %3, %5, or %3 and %5. I have never seen a set of FizzBuzz instructions that actually specified that the last option should be a concatenation of the first two. It's always specified as a 3rd string that people independently notice is a concatenation of the two.


If you're using something like JS, where strings are truthy, you can always do something like this...

https://gist.github.com/tracker1/d58fa1f83ab17d37eb2c


This is absolutely true for real-world coding. However, for the purposes of an exercise, I think the fact that 15 was chosen was not an accident, but is part of the exercise. Does the coder recognize the relationship between the multiples, and recognize the ability to optimize by removing an extraneous comparison?

It's kind of a trivial test for that sort of recognition, but anything as small as FizzBuzz is going to be rather trivial.


As with most of these little tests, the actual question is, "Can you explain your decision, and does your explanation make sense?"

Really, the FizzBuzz test is a check to see if someone is bullshitting when they say they can code. Testing deeper than that is expecting too much of it.


It's to neatly handle the line break.


I suppose it's a style issue as to whether an extra mod is better than checking if one of the triggers has passed. I would propose it's more DRY to do it the short way though.


It's not any shorter. The extra check as to whether or not to print the line break cancels out the mod 15 check. In my opinion, it's cleaner to have three conditionals of the same type than two checking mods and a third checking the OR of the first two.

Of course, it can be actually shorter with a goto.


Whether you print the number, Fizz, Buzz, or FizzBuzz, you are going to output a line break, so I'm not sure what you would be checking for. Output \n unconditionally.


Then you call `printf` twice every loop instead of once. `printf` is buffered so you aren't making two system calls, but you are still making two function calls.


Maybe it's a performance/brevity compromise, but the latter is the issue that I addressed. The least-calls solution would probably be to print the whole output as a single string constant.


then you'd get \n \n fizz\n \n buzz\n rather than fizz\n buzz\n


The canonical FizzBuzz wants

1\n2\nFizz\n4\nBuzz\n...\nFizzBuzz\n

If you did not print anything for some cases, you're right that it could not be unconditional (though it could still be done at the end with a flag).


oops, yes, my mistake.


In some languages you can hide the extra check:

for i in range(1,100):print("Fizz"(i%3==0) + "Buzz"(i%5==0) or i)


For others who are confused by this syntax, it appears that there are *'s that got eaten and turned into italics.


    string output = numb.toString();
    if(numb % 3  == 0) output = "Fizz";
    if(numb % 5  == 0) output.Append("Buzz");
    write output;
It is possible to use two checks to cover all three fizzbuzz components.


5Buzz


Easier to understand for beginners, I suppose.

For the record, I have utterly no math background (hardly passed algebra), but I also agree that checking only 3 and 5 is the better solution and is how I've always written FizzBuzz.


Another thing I don't understand is why people hardcode 15. I would rather write (3 * 5) and let the compiler figure out that 3 * 5 = 15. This way it more clearly states where the number 15 comes from. Any reason to write 15 over (3 * 5)?


interesting point, but I feel the same could be said for (3 * 5)


I do that too. It seems more in the spirit of the problem. For an alternate problem, in which you are asked to print strings a, b and c (!= a concat b) if the number is divisible by 3, 5 and 15 respectively, it makes sense to special case 15.


Because of the requirement to print the number if it is divisible by neither. Here:

    if x isDivisibleBy 3: print "Fizz"
    if x isDivisibleBy 5: print "Buzz"
and.. how do I now print x in the neither case? I can't 'else'. I could make a long if not divisible by either if expression, but that's less easy to read than an if/else chain that starts out with 'if divisible by both, print fizzbuzz'.

If fizzbuzz was: Print FizzBuzz for multiples of 15, fizz for multiples of 3, and buzz for multiples of 5, and nothing otherwise, I bet you'd see the above pseudocode far more.


[deleted]


noice, here's my version

    print("\n".join([("FIZZ"*(i%3==0)+"BUZZ"*(i%5==0)) or str(i) for i in range(1,101)]))


I would hope never to find a bomb like that in the code I'm maintaining.


Really? I find that one line far more pleasant than the Python example above it, with variables being assigned, etc. Both are quite clear in their intention, I think, assuming you recognize "join".

I don't even recognize the language of the one-liner, but it makes perfect sense to me as a reader. It looks like a Perlish solution, but the string method is strange, I think. Maybe Perl 6 or Ruby?


The one-liner is valid Python. It's slightly non-idiomatic in that it uses a list-comprehension where a generator expression would do, and uses range instead of xrange (in Python 2.x; in 3.x range is the idiomatic alternative).


It's also non-idiomatic in that it uses string multiplication and == 0 where an "if not" ternary expression would be much cleaner.


Yes, a ternary would be clearer, to me. I wasn't sure what was happening with the * (and thought maybe it was some use of the "whatever star" in Perl 6, when I was thinking it might be Perl 6, though it doesn't look like any use of the whatever star I've seen), but I just assumed it made sense to someone who knew the language well. It's been a decade since I worked in Python with any regularity.


I would probably use a named function call in real code, but the 'text * (expr == 0)' idiom is pretty clear to me as a way to conditionally include text based on whether a condition is true or not.


It's Python as well.


I found the one-liner perfectly readable. List comprehensions, short-circuiting OR and str.join are pretty idiomatic Python.


Similar:

  print '\n'.join(['FizzBuzz' if x % 15 == 0
      else 'Fizz' if x % 3 == 0
      else 'Buzz' if x % 5 == 0
      else str(x) for x in range(1, 101)])


Starting now with this programming thing, right?

This is straight poser code -- inadvisable in any real life situation, and not that robust either.


slightly shorter javascript:

  for (var i=0; i++<100;) console.log((!(i%3)&&'Fizz'||'')+(!(i%5)&&'Buzz'||'')||i);


I'm not really a programming language expert, but it seems to me that having an implementation being the spec wouldn't be a good idea. If the Streem implementation has a bug, then the bug becomes the authoritative behavior. Any platform specific quirks would also make it difficult to have defined behavior.


Yep, welcome to Ruby!

To be fair, a spec with tests was reverse-engineered out of the Ruby implementation [1], so things have improved a bit.

1. http://rubyspec.org/


Implementations are nearly always the spec when a language is young. You want to be able to experiment and make changes. Then as it matures, you typically get a spec.


Sure, but this is not finished at all. Starting with the spec instead of a prototype implementation sounds like a very very limiting and unrewarding design process.


There's an old argument about worse is better. The gist is, doing things the right way is hard and takes a long time. Sometimes it's better to just get something simple out there and deal with the problems later.

http://www.jwz.org/doc/worse-is-better.html


It looks like a weekend's worth of work for Matz; I doubt he's even thinking of a spec at this point.


Right now, the closest thing to a spec is the sample FizzBuzz code (as an implicit spec that "this code will solve FizzBuzz"); there is no implementation (just work-in-progress parser/lexer code.)

So, while I'll agree that there are issues that come from the implementation being the spec of a language in general, I would say we are well earlier than the point at which we can identify that as a problem with Streem.


Welcome to PHP. The Zend Engine 2 is basically the spec, even though Facebook has recently (couple of weeks ago) started writing a spec to make sure their HHVM is compatible.


At the very least, it's going to be amazing to watch a master language designer build a new language from the ground up.

That said, I'm incredibly optimistic about a new Matz language. If I was going to guess, the syntax will be much lighter and the semantics will make VM optimization much easier than in Ruby.


dude, the man publishes 3 files, claims he's going to make a new programming language, and it hits the front page of hackernews -- Matz has some power there.


It's not power, it's credibility. If someone who has never done anything says they are going to accomplish something large people are skeptical. If someone who already created a successful and highly adopted programming language says they are making a new one people are rightly intrigued.


Would be cool if it would incorporate some thinking and concepts from Flow-based programming [1], as that is AFAIK the most comprehensive architecture covering all the aspects of asyncronous concurrent processing that one might run into (multiple in/out-ports, channels with bounded buffers, sub-stream support, etc etc).

[1] http://www.jpaulmorrison.com/fbp/

[2] http://en.wikipedia.org/wiki/Flow-based_programming


Reminds me a lot of Elixir's |> operator, which does the exact same thing. Nice! Curious how it'll turn out to compare with Elixir on other areas.


I've never heard of Elixir, but I always assumed that the |> originated in F#. Ocaml has it pretty much standard, too, although you can define it in one line in Ocaml:

let (|>) x f = f x
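As a toy illustration, the same idea can be faked in Python by wrapping values; the P class here is made up for this sketch, since Python has no |> operator and we reuse | instead:

```python
# A value is wrapped in P, and | unwraps it, applies the function, and
# rewraps it, so data flows left to right as with |>.
class P:
    def __init__(self, v):
        self.v = v
    def __or__(self, f):
        return P(f(self.v))

result = (P(3) | (lambda x: x + 1) | (lambda x: x * 2)).v
print(result)  # (3 + 1) * 2 = 8
```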


Also in Haskell [0]:

    (|>) :: Seq a -> a -> Seq a
    O(1). Add an element to the right end of a sequence. Mnemonic: a triangle with the single element at the pointy end. 
It seems to be in the containers package since 2005: [1]

[0]: http://hackage.haskell.org/package/containers-0.5.5.1/docs/D...

[1]: https://github.com/haskell/containers/commit/1e61853dbd4b9fc...


Which has a totally different meaning from the F# usage.

'|>' from F# is the same as 'flip $' in haskell, or (&) imported from the lens library


In Elixir, |> does not flip arguments. It lets you chain together multiple functions by inserting the result of the previous function as the first argument of the following function. Here's an example from http://www.theerlangelist.com/2014/01/why-elixir.html

The following code computes the sum of squares of all positive numbers of a list:

list |> Enum.filter(&(&1 > 0)) |> Enum.map(&(&1 * &1)) |> Enum.reduce(0, &(&1 + &2))
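For comparison, the same computation without a pipe operator ends up inside-out, which is exactly the readability problem |> solves. A Python sketch with made-up sample data:

```python
# Sum of squares of the positive numbers, written as nested calls
# (read from the innermost filter outward).
from functools import reduce

nums = [1, -2, 3, -4, 5]
result = reduce(lambda s, x: s + x,
                map(lambda x: x * x,
                    filter(lambda x: x > 0, nums)),
                0)
print(result)  # 1 + 9 + 25 = 35
```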


F# doesn't flip arguments either. It's an operator:

    let (|>) x f = f x
Is basically saying:

    x |> f  is equal to  f x
So in F# it's the same as Elixir, but the value x is applied to the function f (passed as the last argument). i.e.

    list |> List.filter (fun x -> x > 0)
         |> List.map (fun x -> x * x)
         |> List.reduce (fun s x -> s + x)


I was going off what platz said in another comment, that |> flips the arguments in the same way Haskell's flip function does, which I thought the type signature of the F# |> also indicated. I'm sorry if I'm misunderstanding things.

I think the difference though is that the |> in Elixir is actually a macro that modifies the following function call's first argument.

So list |> Enum.filter(&(&1 > 0)) doesn't end up using filter as a curried function as one would find in Haskell:

Enum.filter(&(&1 > 0)) list

The end result is actually:

Enum.filter(list, &(&1 > 0))


Could be cool, but Elixir has the EVM under it, and that is some quality engineering. Ruby is good but kind of clownshoes.


Streem is inspired by Erlang and Ruby, and currently the only code is initial code for the parser and lexer, and there's no information about what the runtime will be like.


Elixir is inspired by Erlang/OTP and Ruby and the runtime is Erlang BEAM/HiPE. :-)


I love Ruby and I love Matz. That said, there are some things that Ruby struggles with. I know there have been conversations among the core team about bringing more functional concepts into Ruby, at least since April. To me this says that Matz is coming to the conclusion that we may need a new language to get functional right.

While I am sad to see that Ruby may be superseded by a new language, I'm really happy to see Matz leading the way with one of the solutions. In the Ruby community we have an expression: "Matz is nice and therefore we are nice". That has set the tone for the community in ways that have never been matched in some of the others.

As someone who has had the opportunity to talk with Matz on multiple occasions and to work with the Ruby community, it would be great to see this as a natural evolution of Ruby and the people who love it. As I have started working with more functional languages, I have moved away from doing Ruby, but if the community can continue on and evolve with a new language, that would be awesome!


> I know that there have been some conversations among the core on bringing in more functional concepts to Ruby....at least since April.

Only since April? Has there been a non-bugfix release of Ruby that didn't bring in more functional concepts to Ruby since, well, ever? AFAICT, bringing more support for ideas that come from the FP world has been one of the perennial drivers of progress in Ruby, not a new issue that emerged this spring.

> To me this says that Matz is coming to the conclusion that we may need a new language to get functional right.

I think that's reading a lot into it. Matz experimenting with something doesn't mean that the intent is to replace Ruby with it.


> "Matz is nice and therefore we are nice"

This non sequitur annoys me. Deconstructing it:

* A: "Matz is nice": Let's say we all agree this is true.

* B: "we are nice": i.e., the ruby community is nice.

* P(A -> B): (A therefore B) is a slogan, so I assume the proposition P is believed to be true. Is it?

In order for P to be true, the only option is for B to be true as long as Matz keeps being nice. Assuming Matz is still good-natured, it is not hard to find counterexamples for B (every big community has some less-than-nice people). So the facts tell us that P is false.

Alternatively, if you assume that Matz is not nice, then, regardless of whether "we" are nice or not, the slogan holds true vacuously: a false antecedent makes the implication true (see the truth table [1]).

Anyway, my point is that the slogan is as silly as this rant :p.

1: http://en.wikipedia.org/wiki/Modus_ponens#Justification_via_...


I don't understand the original quote to be a proposition. I think what it's trying to say is "Matz is nice so we too aspire to be nice". It's not descriptive, it's prescriptive.


I think the intent was more along the lines of "Matz is nice and therefore we should be nice", but in making it shorter and snappier it now uses a more ambiguous piece of the English language.


"Copyright (c) 2015 Yukihiro Matsumoto" - bit early, ey? ;)


>the software is related to the magazine article of 2015 issue of the (Japanese) programming magazine

https://github.com/matz/streem/commit/1c8189f9e1df3289801b28...


Why? I think it's pretty normal to drop this in as boilerplate on new projects.


Presumably because it's not 2015 yet, at least according to my calendar...


Note the year ;)


Ha! You got me there :)


wrong year


Only three weeks early.


It’s from the future!


And... here’s an implementation: https://github.com/mattn/streeem

Well, that was quick.


Heh, got to love the repo message "Sorry, Sorry"


We did a similar thing to the awk source. Made awk scripts first class; you could pipe them to each other.


Can someone help explain what's going on here: \"([^\\\"]|\\.)*\" [seen here in context](https://github.com/matz/streem/blob/master/src/lex.l#L49).

Now it seems to be finding literal strings (so "strings", etc.). That would explain the literal double quotes on either side, so without those we get ([^\\\"]|\\.)*, i.e. zero or more repetitions of [^\\\"]|\\.

What I don't understand is why the explicit \\. alternative is there, as it seems unnecessary. Am I missing something? Also, why does it seem that strings cannot contain a literal \ or a literal "?


Because if there's a \ you want to skip over the next character, even if it's a ", but if it's not escaped then you want " to be the end of the string.
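To illustrate with Python's re module (same pattern shape; the example strings are made up):

```python
# \\. consumes a backslash plus whatever follows it (including a
# quote), while [^\\"] stops at an unescaped closing quote.
import re

string_lit = re.compile(r'"([^\\"]|\\.)*"')
print(bool(string_lit.fullmatch(r'"he said \"hi\""')))  # True
print(bool(string_lit.fullmatch('"unterminated')))      # False
```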


You're completely right, I think I had read it as \. rather than \\. and thus didn't understand it. Thanks for helping me get it straight in my head.


I like the idea of dataflow or stream processing ideas. I would love if you could make the connector pieces smarter so that you were enforcing a contract between the piping mechanisms. I believe you could build some very interesting systems with that approach.


Awesome! I hacked something like this together a couple of years ago using Ruby and GNU Parallel:

https://github.com/charlesmartin14/gnu_parallel

It is badly needed.


This is an excellent example of using a parser as a language. Whether it has any legs depends on whether it beats existing tools on some front (sed, perl, ruby, etc). The concurrent angle is interesting, but I have found a multi-process approach to stream data to be more efficient than most concurrent single-process implementations. For example, with DAP (https://github.com/rapid7/dap), we found that GNU Parallel + Ruby MRI was more effective than a concurrent language such as Go.


Tab (https://bitbucket.org/tkatchev/tab) is another interesting recent text processing language.



It's definitely a good idea. Pipes are both very powerful and very simple to use and debug, yet they are not very common in general purpose programming languages (examples?). I'm not surprised that someone is trying to build a language around them. I'll follow that, but for now it's a bit too early to judge.


See node.js streams.


Great idea, but I'm a bit disappointed in the syntax. Composed chains in Ruby are much nicer to look at.


For people interested, you can do something like this in Ruby right now that has nearly the same syntax:

http://pragdave.me/blog/2007/12/30/pipelines-using-fibers-in...
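For comparison, Python generators support a similar lazy pipeline style out of the box (a sketch of the general idea, not the fiber mechanism from the linked post):

```python
# An infinite source, a filter expressed as a generator expression,
# and a sink that pulls only what it needs.
def numbers():
    n = 1
    while True:
        yield n
        n += 1

def take(it, k):
    return [x for _, x in zip(range(k), it)]

evens = (x for x in numbers() if x % 2 == 0)
first_five = take(evens, 5)
print(first_five)  # [2, 4, 6, 8, 10]
```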


I hope it will support really big integers, as in Haskell, and better floating point calculation, although I cannot foresee the power right now. Because it is concurrent, it may be very useful in scientific computing, whether at small or large scale.


    true{TRAIL}	return keyword_false;
Well, this will be fun to debug


This is indeed one of the nicest typos I've seen, but note that there's already a pull request. I suppose bugs become shallow when your 3-file repo reaches the front page of HN. https://github.com/matz/streem/pull/4/files


The -> macro in Clojure or Hy deals with that nicely:

(-> (read) (eval) (print) (loop))

using python-sh in hy this is possible:

(-> (cat "/usr/share/dict/words") (grep "-E" "^hy") (wc "-l"))


Matz, now get started on implementing transducers! https://www.youtube.com/watch?v=6mTbuzafcII


Stream processing languages typically derive from SQL (StreamSQL) or Prolog (Rapide)... This one doesn't seem anywhere near as powerful, but who knows.


Interesting that he chose a C-like syntax after going the complete opposite direction with Ruby.


Huh? Ruby also has a C-like (algol derived) syntax.

The replacement of "}" with "end" etc, is a trivial replacement, not a different type of syntax.

Lisp, Prolog, etc, would be a non-C syntax.


Obviously you can flatten this inheritance tree any way you like, but while all C-likes are obviously algol-likes I do think it's useful to consider C-like a distinct branch within that.

That said I think you probably want more than just curly braces to define C-like even then.


Now it might be time to have a look into yacc, lex, and maybe bison, and follow Matz's repo :)


Looking to hire a Streem professional, must have at least 5 years experience with Streem.


Great. I have been streem-ing for more than 15 years now. :-)


I see this as a very good alternative to traditional shell scripting languages.


Anything that beats Bash's ugliness I'm all for.


it would also be nice if it had a static type system, but given that it's Matz, it's unlikely...


What kind of areas could Streem be used in?


Why is Stream spelled incorrectly?


Easier to search for, at least.


Sorry, but if you know nothing about Ruby this is pure hype.


Not a fan of the closing braces.


But the opening brace `{` is fine?


Meaning I would rather not have opening or closing.


Didn't Matz mention that he wasn't either, when explaining Rubys syntax?

I'm a fan of indentation based blocks, so this seems like a step back for me :\


What does it matter?


[deleted]


Having littered the battlefield with various versions of Ruby VM, Matz drops the mic, walks off the field





This is newsworthy because Matz is developing a new language, not because such a thing doesn't already exist.



