Wednesday, August 14, 2013

DataTurbine Reading Notes 2: Server, Source

Sources
http://www.dataturbine.org/content/server
http://www.dataturbine.org/content/source

The official site has diagrams, which makes this easier to understand.

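Roughly, these are the things to know when defining a source:
Name: the name of the source.
Target Server: you do have to know where the server is, after all.
Channel: there can be more than one. My project mostly reads data, so finding the right channel is what matters.
Cache Size: since I am not setting up a server for other people to use, this one matters less.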

---------------------------------------------------


DataTurbine Server

What is RBNB?

The DataTurbine server is contained in rbnb.jar. It is the core of DataTurbine and serves as the central point that applications (sources and sinks) interface with.

It is not a replacement for a database and is designed for speed. Because of this, although it is possible to store years of data in a DataTurbine server, most applications also archive their data in permanent storage in a database.

The acronym RBNB stands for Ring Buffered Network Bus, which is the technology inside the DataTurbine server. To data sources (applications that generate data), it acts as a middleware ring buffer that stores heterogeneous time-sequenced data. To data sinks (applications that read data), it acts as a consolidated repository of data. Key to RBNB's scalability is that each source (ring buffer) and each sink (network bus connection) acts independently of the others.

The DataTurbine Server

It can be thought of as a series of rotating disks (a ring buffer) with new data being added and old data removed when the archive becomes full.

Sources (applications that add data to the server) each specify their own archive size and cache size.

The archive size specified by a source determines the size of its ring buffer and how much data is buffered before it is discarded. DataTurbine can use as much storage as the system's physical drives allow. A good value depends on the storage space of the device the server is running on and the needs of the project.

The cache size specified by a source determines how much of its ring buffer is held in memory (RAM). A good value again depends on the system the server is running on and on the applications. A cache can increase speed, but a bigger cache does not necessarily mean a faster system.
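To make the archive and cache sizes concrete, here is a minimal toy model in Python. It is not the RBNB implementation or its API; the class `SourceRingBuffer` and its methods are made up for illustration, and treating the newest frames as the in-memory part is an assumption of this sketch.

```python
from collections import deque


class SourceRingBuffer:
    """Toy model of one source's ring buffer, sized in frames.

    Hypothetical class for illustration only, not part of DataTurbine.
    """

    def __init__(self, archive_frames, cache_frames):
        self.cache_frames = cache_frames
        # A deque with maxlen drops its oldest item once it is full,
        # which is the ring buffer behaviour described above.
        self._frames = deque(maxlen=archive_frames)

    def flush(self, frame):
        """Add one frame; the oldest frame is discarded if the archive is full."""
        self._frames.append(frame)

    def all_frames(self):
        """Everything still held in the ring buffer, oldest first."""
        return list(self._frames)

    def in_memory(self):
        """The newest cache_frames frames, standing in for the RAM-resident part."""
        if self.cache_frames <= 0:
            return []
        return list(self._frames)[-self.cache_frames:]


buf = SourceRingBuffer(archive_frames=5, cache_frames=2)
for i in range(8):
    buf.flush(f"frame-{i}")

print(buf.all_frames())  # frames 0 to 2 have been discarded
print(buf.in_memory())   # only the two newest frames
```

With an archive of 5 frames and 8 flushes, the three oldest frames fall off the end, which is the "old data removed when the archive becomes full" behaviour.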

This approach allows applications to interact with data in near real-time. Sinks can read data as it is collected and display it online, in Matlab, or other applications. Sinks can also interact with the data and move it into permanent storage.

The server is agnostic to the data it receives and can accept heterogeneous data types including numerical, video, audio, text, or any other digital medium. It acts as a black box with sources adding in data and sinks reading the data out.
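The black-box idea can be sketched with a tiny in-memory stand-in for the server. This is only a mental model: `ToyServer`, `put`, and `fetch` are hypothetical names, and the real server talks to sources and sinks over the network rather than through method calls.

```python
class ToyServer:
    """Stand-in for the server: stores (timestamp, payload) pairs per channel
    and never inspects the payload. Hypothetical, not the RBNB API."""

    def __init__(self):
        self._store = {}

    def put(self, source, channel, timestamp, payload):
        """What a source does: add a timestamped payload of any type."""
        self._store.setdefault((source, channel), []).append((timestamp, payload))

    def fetch(self, source, channel, start, end):
        """What a sink does: read back everything in a time range."""
        points = self._store.get((source, channel), [])
        return [(t, p) for t, p in points if start <= t <= end]


server = ToyServer()
server.put("tower", "temperature", 1.0, 21.5)              # numeric
server.put("tower", "temperature", 2.0, 21.7)
server.put("camera", "image", 1.5, b"\xff\xd8...")         # binary (placeholder bytes)
server.put("logger", "notes", 1.2, "sensor restarted")     # text

print(server.fetch("tower", "temperature", 0.0, 1.5))
print(type(server.fetch("camera", "image", 0.0, 2.0)[0][1]).__name__)
```

The sink side only needs a channel and a time range; it does not need to know which program wrote the data or when.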

The server expects an accurate timestamp for every data point. One limitation of this is that data cannot be back-loaded into the server. Data has to be entered sequentially, so for a given source each data point must have a timestamp greater than the previous timestamp on record.

What is a Frame?

Archive and cache sizes are specified as a number of frames. Each time a source application flushes data, it adds one frame. A frame is a data structure of one or more channels, with one or more data objects per channel. Thus the size of a frame may range from small to large, and may vary from frame to frame.
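A small sketch of why "size in frames" does not translate directly into "size in data points". The `Frame` and `ToySource` classes are hypothetical and only model the description above: one flush produces one frame, whatever it happens to contain.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One flush worth of data: channel name -> list of (timestamp, value)."""
    channels: dict = field(default_factory=dict)

    def put(self, channel, timestamp, value):
        self.channels.setdefault(channel, []).append((timestamp, value))

    def point_count(self):
        return sum(len(points) for points in self.channels.values())


class ToySource:
    """Collects points into a pending frame; each flush emits exactly one frame."""

    def __init__(self):
        self.pending = Frame()
        self.flushed = []

    def put(self, channel, timestamp, value):
        self.pending.put(channel, timestamp, value)

    def flush(self):
        self.flushed.append(self.pending)
        self.pending = Frame()


src = ToySource()

src.put("temperature", 1.0, 21.5)
src.flush()  # small frame: 1 channel, 1 point

for t in (2.0, 3.0, 4.0):
    src.put("temperature", t, 21.0 + t)
    src.put("humidity", t, 40.0 + t)
src.flush()  # larger frame: 2 channels, 3 points each

print(len(src.flushed))                          # number of frames
print([f.point_count() for f in src.flushed])    # points per frame
```

So how often a source flushes, and how much it puts into each flush, decides how much actual data a given cache or archive size covers.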

DataTurbine Source
Introduction

A DataTurbine Source (also referred to as an 'on-ramp') is a program that takes data from a target (for example a sensor or file) and puts it into a DataTurbine server.

A source runs independently from the server as a separate application and uses the network to communicate. It can run on the same machine as the server or across the world.

Each source can contain multiple channels, each with its own data type. It controls its own server-side memory and hard drive space allocation.

Anatomy of a Source

    Name: Identifies the source
    Target Server: The server the source sends data to
    Cache Size: Each source specifies how many frames of data to buffer for itself in the server's memory (RAM).
    Archive Size: Each source specifies how many frames of data to store on the server's hard drive.
    Multiple Channels: Each channel is a data stream containing one type of data (for example numeric or video).

    In turn, each channel consists of:
        Name: Identifies the specific channel
        MIME Type: Media type the applications can use to make decisions about the data they are receiving. Each channel can only store one type of data.
        Data: Series of data points consisting of a time and value
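The anatomy above maps naturally onto a pair of record types. The sketch below is only a way to picture the fields; `SourceConfig` and `Channel` are hypothetical names, and the server address, sizes, and MIME type are placeholder values. The tower is borrowed from the practical example further down.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class Channel:
    name: str                  # identifies the specific channel
    mime_type: str             # one media type per channel
    data: List[Tuple[float, Any]] = field(default_factory=list)  # (time, value)


@dataclass
class SourceConfig:
    name: str                  # identifies the source
    target_server: str         # where the source sends its data
    cache_frames: int          # frames buffered in the server's RAM
    archive_frames: int        # frames stored on the server's disk
    channels: List[Channel] = field(default_factory=list)

    def channel(self, name):
        """Look up a channel by name."""
        for ch in self.channels:
            if ch.name == name:
                return ch
        raise KeyError(name)


# All values below are placeholders chosen for the example.
tower = SourceConfig(
    name="MetTower",
    target_server="localhost:3333",
    cache_frames=100,
    archive_frames=10000,
    channels=[
        Channel("temperature", "application/octet-stream"),
        Channel("humidity", "application/octet-stream"),
    ],
)

tower.channel("temperature").data.append((1376438400.0, 21.5))
print([ch.name for ch in tower.channels])
print(tower.channel("temperature").data)
```

For a project that only reads data, the parts that matter are the source name and the channel names, since that is what a sink has to ask for.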

Practical Example:

For example, let us imagine a simple meteorological tower that measures temperature and humidity on top of a hill. Nearby is a field station that is also measuring temperature. We want to get this data into DataTurbine on a laptop at the field station. Let's go over what we would do.

Assume we have custom sources for our instrumentation:

    Start a DataTurbine Server on the laptop (rbnb.jar)
    Start a source on the laptop targeting our server that reads data from the meteorological tower and puts it into DataTurbine. This source would contain two channels (temperature & humidity)
    Start another source on the laptop that reads from the local field station and puts the data into the server. This source would contain a single channel (temperature)

Our laptop would now have three independent lightweight programs running: the server and the two sources. Now that the data is in the server, we need a way to access it. This is discussed in the next section.

PlugIns

PlugIns are a specialized on-request type of data source. Whereas regular sources proactively push data to the DT server, plugins reply with data in response to sink requests forwarded to them via their plugin server connection.

Things to Keep in Mind

Each channel can only have one data type associated with it. Also remember that data cannot be back-loaded into the server. For each channel data has to be entered sequentially and so for a given channel each data point has to have a timestamp that is greater than the previous timestamp on record.
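One way to respect the ordering rule is to guard against it on the source side before sending anything. The sketch below is a hypothetical client-side check, not DataTurbine code; `ChannelLog` and `OutOfOrderError` are invented names, and it does not model what the real server does when it receives out-of-order data.

```python
class OutOfOrderError(ValueError):
    """Raised when a point is not strictly newer than the last one recorded."""


class ChannelLog:
    """Keeps the points of one channel and enforces increasing timestamps."""

    def __init__(self, name):
        self.name = name
        self.points = []  # list of (timestamp, value)

    def append(self, timestamp, value):
        # Each new timestamp must be greater than the previous one on record.
        if self.points and timestamp <= self.points[-1][0]:
            raise OutOfOrderError(
                f"{self.name}: {timestamp} is not after {self.points[-1][0]}"
            )
        self.points.append((timestamp, value))


log = ChannelLog("temperature")
log.append(100.0, 21.5)
log.append(101.0, 21.7)

try:
    log.append(100.5, 21.6)  # an attempt to back-load an older reading
    rejected = False
except OutOfOrderError:
    rejected = True

print(rejected)
print(log.points)
```

In practice this means that readings recovered late (for example from a logger that was offline) cannot simply be appended to the same channel afterwards.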


