Re: Need Wallaroo help? Talk to me

 
Thanks for responding so quickly.

The IP address is indeed on another machine.
The general problem is how to pipe a live stream from an external IP address into Wallaroo.
The particular stream in question collects AIS tracks reporting the position, velocity, etc. of
sea-going vessels, pulled from receiving AIS satellites.

My Wallaroo module produces output when I feed it AIS from a file.
Yet when, *instead* of starting a separate sender, I try the suggested pipe, the result is:

nc -v 153.44.253.27 5631 | nc -v 127.0.0.1 7010
Connection to 127.0.0.1 7010 port [tcp/*] succeeded!
Connection to 153.44.253.27 5631 port [tcp/*] succeeded!

But, alas, no output.

Oh, and I'm using Ubuntu 18.04


Audun


Re: Need Wallaroo help? Talk to me

 

Hello Audun,

I understand that the address you provided is a data stream provided by the Norwegian authorities and it's not owned by you.
In a logical sense, it's the source of your data, but it's not a Wallaroo Source. A Wallaroo TCP Source is essentially a listening socket that's intelligently integrated with the rest of the system, and it lives as part of your running Wallaroo app. So it needs to be set up on an IP that you're running on (either 127.0.0.1 for local testing or your public (WAN/LAN) IP for networked operation).

In this case, you'll want to have a proxy process that connects to the official data source IP, and then forwards all packets to your Wallaroo Source. Something like:

1. Start your Machida application, but change the invocation to listen on a local port:
instruction=("machida "
             "--application-module working "
             "--in 127.0.0.1:8000 "

2. Start a proxy netcat process that will connect to the source and forward to our Wallaroo Source listener:
$ nc 153.44.253.27 5631 | nc 127.0.0.1 8000

You should see data coming in to your decoder function -- but it may not be framed correctly. That's a concern for our next steps, but first try the above and see if you're receiving any data into your Wallaroo app.
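If netcat turns out to be awkward, the same proxy can be sketched in Python 3. This is only a sketch under assumptions that are not confirmed in this thread: that the feed is newline-delimited text, and that the Wallaroo decoder is declared with a 4-byte big-endian length header (`header_length=4`, `length_fmt=">I"`), as in the Wallaroo example applications. The helper names `frame`, `split_lines`, and `forward` are made up for illustration.

```python
import socket
import struct

UPSTREAM = ("153.44.253.27", 5631)     # AIS feed address quoted in this thread
WALLAROO_SOURCE = ("127.0.0.1", 8000)  # must match the address given to --in


def frame(payload):
    """Prefix payload with a 4-byte big-endian length header.

    Assumption: the decoder expects header_length=4 and length_fmt=">I".
    """
    return struct.pack(">I", len(payload)) + payload


def split_lines(buffer):
    """Return (complete non-empty lines, unfinished remainder) for buffered bytes."""
    *lines, rest = buffer.split(b"\n")
    return [line.rstrip(b"\r") for line in lines if line.strip()], rest


def forward(upstream=UPSTREAM, target=WALLAROO_SOURCE):
    """Connect to the feed and relay each line, framed, to the Wallaroo source."""
    with socket.create_connection(upstream) as src, \
            socket.create_connection(target) as dst:
        pending = b""
        while True:
            chunk = src.recv(4096)
            if not chunk:  # upstream closed the connection
                break
            lines, pending = split_lines(pending + chunk)
            for line in lines:
                dst.sendall(frame(line))
```

Start the Wallaroo application first, then call `forward()` from a separate Python 3 process. If the decoder uses a different header size or the feed is not line-oriented, `frame` and `split_lines` would need to change accordingly.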

Good luck and please let us know if you get stuck!
Simon Zelazny






Re: Need Wallaroo help? Talk to me

Sean Allen
 

Hi Audun,

Nice to hear from you.

The error being reported is that 153.44.253.27 port 5631 is unavailable. Either something else is already listening on it, or it's not an IP address on the machine you are on.
The input address should be the IP address that Wallaroo will receive data on.
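A quick way to check this outside Wallaroo is to try binding a plain listening socket to the same address. The sketch below is not part of Wallaroo; `can_listen` is a hypothetical helper name, and it assumes an IPv4 TCP address.

```python
import socket


def can_listen(host, port):
    """Return True if this machine can bind a TCP listening socket to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return True
    except OSError:
        # Typical causes: the port is already in use, or the address is
        # not assigned to any network interface on this machine.
        return False
    finally:
        sock.close()
```

If `can_listen("153.44.253.27", 5631)` returns False on your machine, Wallaroo will not be able to listen there either, and the `--in` address should be changed to one that the machine owns, such as 127.0.0.1.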

When you say:

"I have an IP address () over which the Norwegian Coastal Administration is
transmitting live AIS tracks from AIS satellites."

What does "transmitting live" mean?

In particular, are you saying that something is already listening on that address and receiving data?
Are you saying that IP address is on another machine?

Also, what OS are you using? If Linux, please include distribution and version of said distribution.

-Sean-



On Thu, Jan 17, 2019 at 9:01 AM <audun.stolpe@...> wrote:
Dear Sean,

I'd like to use Wallaroo for real time surveillance of AIS data, i.e. marine vessels.
I have an IP address () over which the Norwegian Coastal Administration is
transmitting live AIS tracks from AIS satellites.


--
Sean T. Allen
VP of Engineering
WallarooLabs.com



Re: Need Wallaroo help? Talk to me

 

Dear Sean,

I'd like to use Wallaroo for real time surveillance of AIS data, i.e. marine vessels.
I have an IP address (153.44.253.27:5631) over which the Norwegian Coastal Administration is
transmitting live AIS tracks from AIS satellites.

Wallaroo so far is exciting and fun, hallmarks of Python of course, but I seem to have hit a roadblock.

Trying to get my Wallaroo module to take input from the said IP address, I use the following startup script:

#!/usr/bin/env python
import os
instruction=("machida "
             "--application-module working "
             "--in 153.44.253.27:5631 "
             "--out 127.0.0.1:7002 "
             "--metrics 127.0.0.1:5001 "
             "--control 127.0.0.1:6000 "
             "--data 127.0.0.1:6001 "
             "--name worker-name "
             "--external 127.0.0.1:5050 "
             "--cluster-initializer "
             "--ponythreads=1 "
             "--ponynoblock"
             )
os.system(instruction)

When run, this produces the output (never mind the names, I'm using the GitHub examples as templates):

----Creating source Split and Count with | Split and Count source | ||----
Finished handling Split and Count node
Local topology initialized.

|^|^|^|Finished Initializing Local Topology|^|^|^|
---------------------------------------------------------
LocalTopologyInitializer.initialize called a second time. Ignoring since this is a single worker cluster.
initializer external: listening on 127.0.0.1:5050
metrics outgoing connected
Recovery file exists for control channel
initializer control: listening on 127.0.0.1:6000
|~~ INIT PHASE I: Application is created! ~~|
|~~ INIT PHASE II: Application is initialized! ~~|
TCPSink initializing connection to 127.0.0.1:7002
TCPSink connected
|~~ INIT PHASE III: Application is ready to work! ~~|
Need ClusterInitializer to inform that topology is ready
Split and Count source attempting to listen on 153.44.253.27:5631
Split and Count source is unable to listen
This should never happen: failure in /home/local/NTU/aud/wallaroo-tutorial/wallaroo-0.6.1/lib/wallaroo/core/source/tcp_source/tcp_source_listener.pony at line 304

All help is gratefully received.

Best,

Audun Stolpe

Senior Scientist,
The Norwegian Defence Research Establishment


Wallaroo 0.6.1 has been released!

 

Hi all,

Today we released Wallaroo 0.6.1!

The highlight of this release is the addition of stream windowing to the Wallaroo API!
We do recommend upgrading and instructions can be found in our release notes.

For additional details, please see our release notes:


- Jonathan


Wallaroo 0.6.0 has been released!

 

Hi all,

Today we released Wallaroo 0.6.0!

The highlight of this release is a complete overhaul of the Wallaroo API to make it cleaner, simpler, and more intuitive. As a result of these changes, this is a breaking release. We do recommend upgrading and instructions can be found in our release notes.

For additional details, please see our release notes:


- Jonathan


New blog post - Using Wallaroo with PostgreSQL

 

Hello,

This week's blog post talks about using PostgreSQL with Wallaroo. It's a good starting point if you're interested in using Wallaroo with a SQL database.

Check it out here: https://blog.wallaroolabs.com/2018/11/using-wallaroo-with-postgresql/

As always, contact us with any questions or feedback,

Erik


Subscribe to the Wallaroo blog

Sean Allen
 

Hi all,

In the not so distant future, we are going to stop posting notices of new blog posts to the user mailing list. Why?

We now have an option to subscribe to the blog. If you subscribe, you'll get notified when we publish new posts.
You can sign up at

-Sean-


Today's blog article: The Treacherous Tangle of Redundant Data

 

Happy Friday, everyone. I'm the author of a blog article published today, "The Treacherous Tangle of Redundant Data: resilience for Wallaroo".

A few weeks ago, John Mumm wrote an article that described what Wallaroo does to be resilient in case of a rebootable crash. But what if the crashed worker cannot reboot? Today's article describes the data redundancy technique that permits a Wallaroo cluster to recover after a crash with catastrophic data loss.

Today's article:
John's article:

-Scott


Wallaroo 0.5.4 has been released!

 

Hi all,

Today we released Wallaroo 0.5.4.

The highlight of 0.5.4 is support for Python 3. Users can now use `machida3` to develop Wallaroo applications written in Python 3; see our latest to get started. Although this is a preview release, we are very excited to get it into your hands.

This is a patch release, meaning there are no breaking changes to the existing API. This allows you to drop your existing Python 2 application(s) into the latest release and take full advantage of the bug fixes we've made since our last release.

Full details available in the release notes:



- Jonathan


New blog post - Introducing Connectors: Wallaroo's Window to the World

 

Hello, this week's blog post gives an overview of our new Connector APIs. These APIs make connecting to external applications much easier than before. We're excited to hear your feedback.

https://blog.wallaroolabs.com/2018/10/introducing-connectors-wallaroos-window-to-the-world/


This week's blog post: Wallaroo clusters on demand

 



If you're looking into setting up a Wallaroo cluster, take a look at how easy it is to do with Pulumi + Ansible.

Cheers,
Simon Zelazny


New Checkpointing Blog Post

 

Hey everyone,

We just put up a new blog post that discusses our recent asynchronous checkpointing work released in 0.5.3. It goes into some detail about the problems around recovering distributed systems to a consistent global state, some ways you might do this incorrectly, and some reasons why the checkpointing algorithm we settled on is a great fit for streaming systems.

Check it out:


Need Wallaroo help? Talk to me

Sean Allen
 

Hi everyone,

I hope this message finds you well. I wanted to let everyone on the mailing list know that I'm now doing Developer Relations at Wallaroo Labs. What does this mean for you? Well, if you need help getting started with Wallaroo or making a Wallaroo project successful, come talk to me.

I'd be thrilled to get emails from folks asking how Wallaroo can help solve their Python data processing problems. I'd love to dig into use cases with you. Whatever it is you need (within reason!), I'm here to help you with. Feel free to reach out either via the mailing list or to my email sean@....

-Sean-


ICYMI: Wallaroo is now Apache 2 licensed

Sean Allen
 

Previously most of the code base was Apache 2.0 licensed but some was "source available" and under a non-open source license. Full details in my blog post from last week: https://blog.wallaroolabs.com/2018/10/wallaroo-goes-full-apache-2.0/


Wallaroo 0.5.3 has been released!

 

Hi all,

Today we released Wallaroo 0.5.3.

This is a patch release that includes two very important new features. First, we've released a preview version of the Python Connector API. This allows developers to build sources and sinks without the need to worry about Wallaroo's internal protocol. We also have a better resilience story: we now use an algorithm based on the Chandy-Lamport snapshotting algorithm that minimizes the impact of checkpointing on processing in-flight messages.

Full details available in the release notes:

-Dipin


New blog post: "Making Python Pandas go fast"

 

Hi everyone,

It's Thursday, and that's (usually) blog day! This week's post is about parallelizing pandas batch jobs with Wallaroo:

Cheers,
Simon Zelazny


New blog post: "Streamlining the Wallaroo installation process with Wallaroo Up."

 

Hi all,

We have a new blog post about "Streamlining the Wallaroo installation process with Wallaroo Up."

You can find it at:

-Dipin


InfoQ article about Wallaroo's new consistent hashing technique

 

Hi, everyone. The folks at InfoQ have published an article that I've written about the new consistent hashing technique that is being added to Wallaroo.

The 0.5.0 Wallaroo release added support for "dynamic keys" [1]. Wallaroo can now automatically recognize new keys and route data to the appropriate partitioned state step. The routing algorithm used today (including the recent 0.5.2 release) is very general and is unaware of load balancing considerations. We are now integrating a technique called Random Slicing into Wallaroo to add very fine-grained control over the distribution of keys across Wallaroo worker processes.

The second half of the InfoQ article [2] describes the Random Slicing consistent hashing technique, together with several illustrations to show how it adapts to changing cluster membership and load balancing criteria. The first half of the article describes some earlier consistent hashing techniques, including the one used by Amazon's original Dynamo database and adopted by Riak and other distributed databases.
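For readers who have not met the idea before, here is a toy Python sketch of slicing the hash range [0, 1) among workers. It is not Wallaroo's implementation and is much simpler than what the article describes: it assumes equal weights for all workers, and the class name `RandomSlicing` and its methods are invented for this illustration. The point it tries to show is that when a worker joins, each existing worker gives up only a small piece of its range, so the only keys that move are the ones that land on the new worker.

```python
import bisect
import hashlib
from fractions import Fraction


class RandomSlicing:
    """Toy partition of [0, 1) into slices, each owned by one worker."""

    def __init__(self, first_worker):
        # Parallel lists: slice i covers [starts[i], starts[i + 1]),
        # and the last slice runs up to 1.
        self.starts = [Fraction(0)]
        self.owners = [first_worker]

    def _slices(self):
        ends = self.starts[1:] + [Fraction(1)]
        return list(zip(self.starts, ends, self.owners))

    def share(self, worker):
        """Total width of the slices owned by worker."""
        return sum(end - start
                   for start, end, owner in self._slices() if owner == worker)

    def add_worker(self, new_worker):
        """Cut the tail off existing slices until every worker has an equal share."""
        workers = sorted(set(self.owners))
        target = Fraction(1, len(workers) + 1)
        surplus = {w: self.share(w) - target for w in workers}
        rebuilt = []
        for start, end, owner in self._slices():
            give = min(surplus[owner], end - start)
            surplus[owner] -= give
            if end - start - give > 0:
                rebuilt.append((start, owner))            # part the owner keeps
            if give > 0:
                rebuilt.append((end - give, new_worker))  # part handed over
        self.starts = [start for start, _ in rebuilt]
        self.owners = [owner for _, owner in rebuilt]

    def lookup(self, key):
        """Hash key to a point in [0, 1) and return the owner of its slice."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        point = Fraction(int.from_bytes(digest[:8], "big"), 2 ** 64)
        return self.owners[bisect.bisect_right(self.starts, point) - 1]
```

Exact fractions are used instead of floats only to keep the slice arithmetic free of rounding; a real implementation would also need weights per worker and a way to remove workers.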

John Mumm, Andy Turley, and I are excited to bring enhancements to Wallaroo's dynamic keys, load balancing, and crash resilience features in the next month. If you have questions about any of this work, please feel free to contact me by email.

-Scott

[1]
[2]


[New blog post] Real-time Streaming Pattern: Analyzing Trends

 

Hello everyone,

We have a new blog post on another stream processing use case, analyzing trends.

You can read it here:

Otherwise enjoy your weekend!