It's really just one tiny command~ super handy~
This is why I've since grown fond of postgresql 9.1...
Otherwise centos ships with 8 natively; the 8 I ran for years was rock solid too~
Making extra work for yourself is exhausting T___T....
The prerequisite is simply a working postgresql 9.1 install... (the extension feature only arrived in 9.1)
plus the postgis plugin installed~
Then once your own database is created~ whenever the need arises, run a single command
CREATE EXTENSION postgis;
and you instantly get all of postgis's goodies ^_^Y....
As for what makes postgis 2.0 great~ I haven't noticed anything in particular yet~
since what I use most is distance computation, spatial containment and the like... and those were already quite complete back in 1.5
The official docs, for reference...
http://postgis.net/docs/postgis_installation.html#create_new_db_extensions
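For the distance/containment stuff above, a little sketch of how I'd poke it from Java over JDBC (the table places, the column geom, and all the connection details are made-up placeholders; assumes the postgresql JDBC jar on the classpath):

// Hypothetical example: rows within 500 m of a lon/lat point,
// assuming a "places" table with a geometry column "geom" in SRID 4326.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class GeoQuery {
    public static void main(String[] args) throws Exception {
        Connection c = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "myowner", "secret");
        PreparedStatement ps = c.prepareStatement(
                "SELECT name, ST_Distance(geom::geography, ST_SetSRID(ST_MakePoint(?, ?), 4326)::geography) AS meters"
              + " FROM places"
              + " WHERE ST_DWithin(geom::geography, ST_SetSRID(ST_MakePoint(?, ?), 4326)::geography, 500)");
        ps.setDouble(1, 121.5); ps.setDouble(2, 25.0); // lon, lat
        ps.setDouble(3, 121.5); ps.setDouble(4, 25.0);
        ResultSet rs = ps.executeQuery();
        while (rs.next()) {
            System.out.println(rs.getString("name") + " -> " + rs.getDouble("meters") + " m");
        }
        c.close();
    }
}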
The old way was a real hassle~ you had to run several sql scripts to import the functions, objects....etc
In the early days, to be safe~ we'd first build one database with geom all set up~
then turn it into a database template and clone new ones from that~
only the owner/role settings that came along sometimes left you grumpy 一_一a....
Friday, December 20, 2013
Friday, December 13, 2013
Java call command
Nothing too special here~ pretty much basic Java O_o...
It's just that this integration runs as standalone APs...
so the main flow just keeps firing off cmd calls to get things done~~
The core of it is Runtime~
Some of the APs dump their results or exceptions straight onto the console, so you have to read their output properly~
If you just go for the simple bare Runtime process, you may find there's no way to switch the working directory over~
ProcessBuilder is the better choice here...
ProcessBuilder builder = new ProcessBuilder(cmd);
builder.directory(new File(home)); // run with home as the working directory
Process p = builder.start();
As for Process.waitFor() hanging: apparently the precondition is to read the process's input streams (InputStream, ErrorStream) to the end and close its OutputStream first~ call it after that and there's no problem...
Though I only needed to read what lands on the console~ so whatever XD...
We'd agreed with the other libs up front~ results go to the console or console+file... a unified convention is easier to deal with...
In practice I didn't want to hand-write lots of cmd calls inside java, so the main approach is~ have java first write out a dynamic sh/bat file... with whatever parameters or multi-line sequence of steps it needs~ then just execute that one file... writing a file has its perks~ re-running things while debugging or just exercising a lib is much more convenient~
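Roughly along these lines (a sketch only; the script content, file names and the second jar are invented for illustration):

// Sketch: generate a one-off .bat (or .sh) and execute it once.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileWriter;
import java.io.InputStreamReader;

public class ScriptRunner {
    public static void runSteps(String home, String script) throws Exception {
        FileWriter w = new FileWriter(new File(home, script));
        // the steps below are placeholders for whatever the flow needs
        w.write("java -cp dsp-collection.jar WavFilterFisher in.wav bandpass chebyshev 4 -0.5 1000 5000 out.wav\n");
        w.write("java -jar mp3step.jar out.wav out.mp3\n"); // hypothetical follow-up step
        w.close();
        ProcessBuilder builder = new ProcessBuilder("cmd", "/c", script); // "sh" instead on linux
        builder.directory(new File(home));
        builder.redirectErrorStream(true); // fold stderr into stdout so one read drains both
        Process p = builder.start();
        BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
        for (String line; (line = r.readLine()) != null; ) {
            System.out.println(line);
        }
        p.waitFor();
    }
}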
public String runCmd(String home, String cmd) throws Exception {
    // run cmd with the working directory set to home
    Process p = Runtime.getRuntime().exec(cmd, null, new File(home));
    StringBuilder sb = new StringBuilder();
    BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
    String tmp;
    while ((tmp = reader.readLine()) != null) { // null means the stream has ended
        sb.append(tmp).append("\n");
    }
    reader.close();
    return sb.toString();
}
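Called more or less like this (path and command are just for illustration):

// e.g. list a folder on windows; on linux something like "ls -al" instead
String out = runCmd("D:/work", "cmd /c dir");
System.out.println(out);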
Later, somewhere else, I reworked it into what should be the more orthodox version~ with p.waitFor() added: drain stdout and stderr to the end, close stdin, then wait
public String runCmd(String home, String cmd) throws Exception {
    Process p = Runtime.getRuntime().exec(cmd, null, new File(home));
    // drain stdout to end-of-stream (output is discarded here)
    BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()));
    while (reader.readLine() != null) {
        ;
    }
    reader.close();
    // then drain stderr the same way
    reader = new BufferedReader(new InputStreamReader(p.getErrorStream()));
    while (reader.readLine() != null) {
        ;
    }
    reader.close();
    p.getOutputStream().close(); // nothing to feed into stdin
    p.waitFor();
    return "";
}
Tuesday, December 10, 2013
Nagwin
Sigh~ never mind the backstory~
Anyway... the plan was to install nagios, a host-monitoring package~ except it's native to linux
and the machine to install on is a windows server
Their official site
http://www.nagios.org/
Amusingly~ their way of supporting other platforms is to provide a "VM image"...
But I'm already running inside a VM... am I really going to play VM-inside-a-VM~ 囧?
So in the end I went with Nagwin... in short~ nagios plus a simulated linux runtime environment, shoehorned onto windows
https://www.itefix.no/i2/nagwin
This thing~ is not what I'd call a flawless installer~
(by flawless installer~ I mean the postgresql or apache httpd kind~ two clicks and it works one hundred percent of the time)
The main trouble spot, as I see it, lies in... how accounts and permissions are handled~
First, the package is 32-bit code only, but that's beside the point~ as long as it runs~ who cares 一u一a...
Also, during install~ it asks for a windows account and password, presumably for registering the windows services (after a successful install, four new services show up in windows Services, running under the account you just typed). But windows accounts are a magical system: being able to install doesn't mean being able to run... (the account I used (not administrator) could install the services but not start them... though after changing the user in the service settings~ they started fine XD, and I haven't seen anything else odd yet)
Nagwin's own web page takes an account/password login; the password is changed with the command below~ and it can easily happen that you run the command~ but it fails to write the password file (the file's modified date makes this obvious)... so the shameless fix~ find a machine with enough rights where it does work~ change the password there~ and copy the file over to the broken machine Orz...
The nagwin plugin wasn't so lucky though...
winrpe (nrpe from stock nagios, installed on the other hosts as the agent that feeds data to the main nagios)
https://www.itefix.no/i2/winrpe
No idea why (probably account permissions again, since things like logs couldn't be written and folder/file permissions had to be fixed by hand)~ it reports a successful install but never actually runs (no service listening on the expected port)... fine, whatever...
Worst of all, it went and changed other settings on that account (the most immediate discovery: it locked the account out of remote desktop; luckily I found where to clear the blacklist (Local Security Policy -> User Rights Assignment -> Deny log on through Remote Desktop Services...); whatever other side effects exist, I haven't hit them yet Orz...
I suppose it's lucky the disease broke out on the test machine first 一_一... though being unable to log into the test machine caused plenty of grief too.........
So for now I'm setting it free...
For installation, follow the official site... installing things on win has never been particularly hard...
https://www.itefix.no/i2/content/nagwin-installation
The failures all show up in odd places~ basically system-environment problems...
The monitoring configuration is basically the same as nagios~ and from the install path you'll easily recognize the directory layout linux folks love~
The password-change command lives under bin/
htpasswd2 -b /etc/nginx/htpasswd nagiosadmin new-password
The check_xxx commands for individual monitored items all live under plugin/
Before configuring anything, try the check commands here first to see whether they behave as expected~
and only write them into the config for monitoring once they test OK~
otherwise you'll stare at problems forever without seeing where they are....
Off topic: after a long day of googling nagios... I found facebook~ recommending the nagios fan page to me!!
centos + ganglia
Ganglia is a host-resource-monitoring package~
basically for watching cpu/ram/network usage and the like~
It's native to linux~
For resource monitoring~ I find the designs rarely stray from this pattern~
there's one main server that collects the data and provides the web/AP dashboard interface~
plus something agent-like~ installed on every monitored host...
which periodically sends data to the server, or the server pulls it off the agent machines...
Which is reasonable~ since much of the monitoring reaches down to the host's hardware usage~
and a sane host~ won't let other hosts access its hardware info directly...
ganglia splits into three parts~
ganglia-gmetad --> data aggregation (server only)
ganglia-web --> the http interface (server only); it's a php package, so httpd must be installed first
ganglia-gmond --> data collection (every monitored machine needs it)
Once installed, these become centos services, so if you want them started at boot~ you have to enable that yourself
The web interface has no authentication~ once it's up~ anyone with the URL sees everything~
The main point is that the cluster name in gmond must match the data_source name in gmetad, otherwise they won't line up.. that name is also shown on the page~ treat it as the name of the server group~
The default agent/server port is 8649; to use a different port, append it after the host on the gmetad side
In gmond, the other thing worth changing is host.location; put the machine's name there for easy identification
Tweak udp_send_channel: ttl=1 is needed for it to go live, host should be the machine's own IP; check whether the rest makes any difference
The install goes like this:
su
rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum install ganglia ganglia-gmetad ganglia-web ganglia-gmond
#only client without server
#yum install ganglia ganglia-gmond
##if everything OK, ganglia will be installed /etc/ganglia/
## httpd page installed /usr/share/ganglia
vim /etc/ganglia/gmetad.conf
##Update data source name and ip address : data_source "OnSite_QA" 60 ipaddress
##If you define 60 every 60 seconds the data will poll from the ganglia monitoring daemons
data_source "hadoop" 60 192.168.3.xx1 192.168.3.xx2 192.168.3.xx3 192.168.3.xx4
vim /etc/ganglia/gmond.conf
cluster {
name = "hadoop" ##modify
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
host {
location = "host1"
}
udp_send_channel {
#bind_hostname = yes # Highly recommended, soon to be default.
# This option tells gmond to use a source address
# that resolves to the machine's hostname. Without
# this, the metrics may appear to come from any
# interface and the DNS names associated with
# those IPs will be used to create the RRDs.
#mcast_join = 239.2.11.71
port = 8649
host = 192.168.3.xx1
ttl = 1
}
udp_recv_channel {
port = 8649
}
tcp_accept_channel {
port = 8649
}
##start service
#/etc/init.d/gmetad restart
#/etc/init.d/gmond restart
service gmetad restart
service gmond restart <--better
service httpd restart
##change php allow
vim /etc/httpd/conf.d/ganglia.conf
<Location /ganglia>
Order deny,allow
#Deny from all ##modify here
Allow from all ##modify here
Allow from ::1
# Allow from .example.com
</Location>
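As a quick sanity check that a gmond is really serving data: whatever connects to the tcp_accept_channel port gets the full metrics XML dumped back, so a throwaway sketch like this will print it (host passed as an argument; 8649 as configured above):

// Sketch: read the metrics XML that gmond serves on its tcp_accept_channel.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;

public class GmondPeek {
    public static void main(String[] args) throws Exception {
        // usage: java GmondPeek <gmond-host>
        Socket s = new Socket(args[0], 8649);
        BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
        for (String line; (line = in.readLine()) != null; ) {
            System.out.println(line); // gmond replies with one big XML document, then closes
        }
        s.close();
    }
}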
Setting docBase in tomcat
This came from wanting to expose the files under some folder over http...
basically using tomcat as a file server~
Tomcat's usual deploy path is under {tomcat_home}/webapps/xxxx...
Supposedly docBase can be set in context.xml...
but there seems to be some restriction... at any rate it just wouldn't point across...
the difference being that the file-server path is on another drive, on separately mounted storage~
so the more brute-force option~ do it in server.xml instead
Add this inside the Host element at the very bottom of server.xml:
<Host ...>   <---- at the end of the xml, inside the Host element
    <Context path="/xxx" docBase="G:/ftp/xxx" debug="0" reloadable="false" crossContext="true" />
</Host>
Restart tomcat~ and that's it~
The resulting links look roughly like this...
http://{ip:port}/xxx/abc.jpg --> (G:/ftp/xxx/abc.jpg)
win + apache proxy
Installing apache on windows must be among the most common topics~
you barely have to pray to the google gods to find plenty~~
In the past, to keep the stack simple~ one tomcat solving everything was easiest~
but this time four or five http services from different sources have to be integrated, then exposed through a single public site...
so leaning on a proxy with a solid reputation is less work~~
(otherwise a crowd will later argue about whose forwarding is broken~)
For web proxying~
apache offers quite a few approaches...
ex: mapping designated directories, virtual hosts...etc; for tomcat integration there is also ajp and so on~~~
The post-install config on windows differs from centos~
(the lazy install on centos enables more features)
Look closely~ and the modules in httpd differ~
Under LoadModule, windows has vhost (virtual host) and ajp disabled by default~
and the related libs may not even ship.. they'd need a separate download
So for simplicity... I went straight for a plain proxy...
on centos it seems to set itself up as a virtual host automatically...
virtual hosts are the classier option~ you can layer connection restrictions on top~
but what I'm running here is already the outermost public-facing web~
so plain forwarding is perfectly OK~
The gist:
1. Download the installer from the official site
http://httpd.apache.org/download.cgi
2. Double-click to install...
3. Tweak the config
httpd.conf changes:
Listen 80 <---change to the port you need
Enable the modules:
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
4. Set up the simple proxy
Scroll to the very bottom of httpd.conf and add the forwarding paths:
ProxyPass /tomcat/ http://localhost:8080/
ProxyPassReverse /tomcat/ http://localhost:8080/
5. Start (restart) the apache service
6. Test
Wednesday, December 4, 2013
java wav frequency filtering
The point here is to filter the source wav by frequency, keeping only part of the band...
I'm sure google has plenty of implementations~
but I found a handy lib...
a single jar and one cmd and it's done~
It's dsp-collection.jar
Source:
http://www.source-code.biz/dsp/java/
Usage is dead simple~ just run the cmd
Not that I'd know which algorithm is best XD.. my personal guiding principle is... as long as it works
ex (reading the arguments: presumably a band-pass Chebyshev filter of order 4 with -0.5 dB ripple, passing 1000 to 5000 Hz):
java -cp dsp-collection.jar WavFilterFisher sample.wav bandpass chebyshev 4 -0.5 1000 5000 out.wav
Here's a file as illustration~ the filtering isn't perfectly clean, but the difference is visible +_+
Spectrogram of the original file
Spectrogram after filtering
java wav to mp3
Handling wav is fine~ but the files are huge~ not great for publishing...
so for release everything is compressed to mp3~
I went with the jave lib...
http://www.sauronsoftware.it/projects/jave/
mainly because it's lightweight (a single jar)
works on both win/centos
callable from java
and it can actually be made to work (the key point)
Sample java invocation:
// classes come from the single jave jar (package it.sauronsoftware.jave)
import it.sauronsoftware.jave.AudioAttributes;
import it.sauronsoftware.jave.Encoder;
import it.sauronsoftware.jave.EncodingAttributes;

File source = new File(fin);
File target = new File(fout);
AudioAttributes audio = new AudioAttributes();
audio.setCodec("libmp3lame");            // mp3 via lame
audio.setBitRate(new Integer(128000));   // 128 kbps
audio.setChannels(new Integer(2));       // stereo
audio.setSamplingRate(new Integer(44100));
EncodingAttributes attrs = new EncodingAttributes();
attrs.setFormat("mp3");
attrs.setAudioAttributes(audio);
Encoder encoder = new Encoder();
encoder.encode(source, target, attrs);
java wav computing db values
Continuing from the previous post...
there's a volume-computation part (the 'db' value here is really just a relative magnitude)
I forget where I copied it from~
but googling the method name should turn up plenty...
anyway, throw in the raw data read out of WavFile and a value pops out XD"
The main method for computing it:
/** Computes the RMS volume of a group of signal sizes ranging from -1 to 1. */
public double volumeRMS(double[] raw) {
double sum = 0d;
if (raw.length==0) {
return sum;
} else {
for (int ii=0; ii<raw.length; ii++) {
sum += raw[ii];
}
}
double average = sum/raw.length;
double sumMeanSquare = 0d;
for (int ii=0; ii<raw.length; ii++) {
sumMeanSquare += Math.pow(raw[ii]-average,2d);
}
double averageMeanSquare = sumMeanSquare/raw.length;
double rootMeanSquare = Math.pow(averageMeanSquare,0.5d);
return rootMeanSquare;
}
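Strictly speaking what this returns is an RMS amplitude with the DC offset removed (a standard deviation, in effect), not decibels; if a real dB figure is ever wanted, the usual conversion (my addition, not part of the copied method) would be:

// relative dB: 0 dB when the RMS hits the reference level
// (ref = 1.0 suits samples normalized to the -1..1 range)
double ref = 1.0;
double db = 20.0 * Math.log10(volumeRMS(raw) / ref); // negative for quieter signals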
java WavFile file cutting
For processing wav data, asking google turns up quite a few solutions~
This time the priorities for picking an audio lib were...
1. It has to work, and be wrappable behind an api
2. compatible with both win/linux(centos)
3. prefer java-native/java-related
4. lightweight / few dependencies
Because it isn't just one or two operations... everyone says it's easy~
yeah right~ as if wiring four or five features together is easy....
so many of the famous ones were out T_T...
the famous ones mostly come packaged with a UI~ and it's hard to pry individual methods out for reuse @@...
In the end I took one that ships its source and reworked it myself... it only handles plain wav...
some of the math may not be quite right~ the reason being that I don't really understand the exact wav format and contents~
but it does the parts I need, good enough ^^0....
I picked it because it's the simplest lib (pure java code, no extra lib/jar)~
and it's one of the few whose source code I could actually understand |||Orz...
(with some libs/open source I understand half~ then force-modify the source and it goes GG... the framework ties too much flow together~ there's no way to find every right place to change QQ)
First, where the source code comes from
http://www.labbookpages.co.uk/audio/javaWavFiles.html
In application it's used for two things
1. cutting files (ex: cut 1 minute out of 2)
2. finding where the volume is relatively loud (volume equals the magnitude of the data... the bigger the value, the louder)
Combined~ that means: find a relatively loud spot and cut N seconds from there~ into a new wav file
This post has the file-cutting method
Basically it first creates an empty {secs}-second wav, then reads from the source wav starting at some position and writes across...
It handles the case where that position sits near the end of the source (which could otherwise write only part of the data~ leaving the tail of the new file empty)
and when the source is shorter than the new file it just throws an exception... (in my case, a source that's too short shouldn't be processed at all)
/**cut wav file, start with buffersize * start, cut secs seconds
* @param wavFile source
* @param buffersize sample buffersize
* @param start idx of the sample
* @param secs cut how long seconds
* @param target output file path*/
public void process(WavFile wavFile, int buffersize, int start, double secs, String target) throws Exception {
WavFile newFile = null;
try {
long sampleRate = wavFile.getSampleRate(); // Samples per second
double totalsecs = wavFile.getNumFrames()/buffersize; // note: treats one buffer as one second, i.e. apparently assumes buffersize == sampleRate
if(start+secs > totalsecs) {
start = (int)(totalsecs - secs);
}
if(start < 0) {
start = 0;
}
double duration = secs; // Seconds
// Calculate the number of frames required for specified duration
long numFrames = (long)(duration * sampleRate);
newFile = WavFile.newWavFile(new File(target),
wavFile.getNumChannels(),
numFrames,
wavFile.getValidBits(),
sampleRate);
// Display information about the wav file
//wavFile.display();
//System.out.println("-----------");
//newFile.display();
// Get the number of audio channels in the wav file
int numChannels = wavFile.getNumChannels();
int frames = buffersize;
// Create a buffer of 100 frames
double[][] buffer = new double[numChannels][frames];
int framesRead;
int loop = 0;
// Read frames into buffer
do {
framesRead = wavFile.readFrames(buffer, frames);
if(loop >= start) {
long remaining = newFile.getFramesRemaining();
int toWrite = (remaining > frames) ? frames : (int) remaining;
// Write the buffer
//newFile.writeFrames(buffer, toWrite);
newFile.writeFrames(buffer, toWrite);
if(toWrite < frames) {
break;
}
}
loop++;
} while (framesRead != 0) ;
} catch(Exception e) {
throw e;
} finally {
if(newFile != null) {
try { newFile.close(); } catch(Exception eee) {}
}
}
}
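For reference, calling it looks roughly like this (a sketch; assumes the WavFile class from the link above, and passes the sample rate as buffersize so that one buffer equals one second, which matches how the bounds math above reads):

// cut 30 seconds starting from second 10 of in.wav into out.wav
WavFile wav = WavFile.openWavFile(new File("in.wav"));
try {
    int oneSecond = (int) wav.getSampleRate(); // one second worth of frames per buffer
    process(wav, oneSecond, 10, 30.0, "out.wav");
} finally {
    wav.close();
}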
Tuesday, October 29, 2013
wav spectrum (spectrogram) + python audiolab
This comes from needing to process wav file data this time around...
py audiolab can actually handle more than spectra,
it's just that the other tasks were solved in Java~
so only the spectrogram part is in python...
(partly because I can't compute spectra myself, and partly because Java's plotting packages are too arcane; the py resources I found were already plenty pretty Q_Q)
Environment setup is pasted in the previous post...
First, a result image~
The program is only a few lines; getting there was quite the ordeal...Q_Q...
The main points, roughly:
installing python and the relevant application modules...
finding a workable chunk of code as a base...
(google audiolab+spectrum and it should show up~ mine came from google too)
getting the chart layout right: alignment (the top and bottom plots have to line up), intervals, colors
(the sample is a 2-minute wav, but in practice only about 30s is used, hence the 5s units on the x axis)
solving the chinese-font problem....
(python has no chinese out of the box; in the end solved by pointing at a specific TTF, which also keeps cross-platform installs simple Q_Q)
Source code
#!/usr/bin/env python
#-*- coding: utf-8 -*-
import sys
from pylab import *
import wave
import matplotlib.pyplot as plt
myfont = matplotlib.font_manager.FontProperties(fname='/etc/xxxx/MSJH.TTF')
infile = None
outfile = None
secInterval = 5 #x-axis interval
def show_wave_n_spec():
if(len(sys.argv) < 2) :
print('err input, plz input argv[1] and argv[2]')
return
infile = sys.argv[1]
outfile = sys.argv[2]
spf = wave.open(infile,'r')
f = spf.getframerate()
sound_info = spf.readframes(-1)
sound_info = fromstring(sound_info, 'Int16')
spflength = round(spf.getnframes()/f)
#print(spflength)
#print(sound_info)
#---------------pic1
ax = subplot(211)
title(u'Waveform(波形圖)',fontproperties=myfont)
plot(sound_info, '#5755FF')
xlim(0, len(sound_info))
plt.axis([0, len(sound_info), -15000, 15000])
grid(True)
#change axis-x position/text
xlabel2 = range(0, int(spflength+1), secInterval)
xposition2 = []
for x in xlabel2:
xposition2.append(x*f)
#print(xposition2)
ax.set_xticks(xposition2)
ax.set_xticklabels(xlabel2)
#change axis-y
ylabel2 = []
for x in ax.get_yticks() :
ylabel2.append(str(x/1000)+'k')
ax.set_yticklabels(ylabel2)
#---------------pic2
ax2 = subplot(212)
title(u'Spectrogram(頻譜圖)',fontproperties=myfont)
spectrogram = specgram(sound_info, Fs = f, scale_by_freq=True,sides='default')
plt.axis([0, spflength, 0, 20000])
ax2.set_xticks(xlabel2)
ax2.set_xticklabels(xlabel2)
grid(True)
ylabel2 = []
for x in ax2.get_yticks() :
ylabel2.append(str(x/1000)+'k')
ax2.set_yticklabels(ylabel2)
#show()
savefig(outfile);
spf.close()
show_wave_n_spec()
centos 6.3 + python module scikits.audiolab
The original plan was to handle it all in Java~ but the spectrogram simply would not come out...
(asking someone who scored 20~40 in engineering math to do this was always asking too much Orz...)
In the end I compromised and went with python packages...
First the happy windows installers (far happier than centos, provided you can find them)...
I chose 2.6~ because it's the version I could find everything for back then~
luckily centos also ships 2.6 built in, so at least the code moves over unchanged...
The windows happy-pack~ if memory serves, is (install in order)
python-2.6.6rc2.amd64.msi
numpy-MKL-1.7.1.win-amd64-py2.6.exe
matplotlib-1.2.1.win-amd64-py2.6.exe
scikits.audiolab-0.10.2.win32-py2.6.exe
And then came the part that truly surprised me... the centos install...
python is built in, how can installing this be such a disaster...
I honestly don't know what to say... below is the consolidated result of chasing down one 'xxx package not found' after another~
there may be some extra junk installed along the way~ but the point is... it gets all the way through T___T...
The essentials boil down to
getting gcc installed
getting the numpy module installed (1.6.2)
getting matplotlib installed (1.1.1) <--wanted the newest, gave up after seeing the prerequisite list... luckily the old one works... speechless
getting audiolab installed (0.11.0)
Except that getting any single module in~ means googling nonstop and trying the install N times...Orz...
Installing feels like debugging~ nice job, open source... meow
su
yum install gcc
su
cd /etc/xxx/install
wget --no-check-certificate http://pypi.python.org/packages/source/d/distribute/distribute-0.6.27.tar.gz
tar -xvf distribute-0.6.27.tar.gz
cd distribute-0.6.27
python setup.py build
python setup.py install
yum install gcc-gfortran
yum install blas-devel
yum install lapack-devel
yum install python-dev python-devel
yum install gcc-c++
yum install libpng-devel
##install pip
#yum -y install python-setuptools
easy_install pip
pip install pil
##install numpy
cd /etc/xxx/install
#wget --no-check-certificate https://pypi.python.org/packages/source/n/numpy/numpy-1.7.1.tar.gz#md5=0ab72b3b83528a7ae79c6df9042d61c6
wget http://downloads.sourceforge.net/project/numpy/NumPy/1.6.2/numpy-1.6.2.tar.gz
tar -xvf numpy-1.6.2.tar.gz
cd numpy-1.6.2
python setup.py build
python setup.py install
##
wget http://downloads.sourceforge.net/project/scipy/scipy/0.11.0/scipy-0.11.0.tar.gz
tar -xzf scipy-0.11.0.tar.gz
cd scipy-0.11.0
python setup.py build
python setup.py install
##install freetype
cd /etc/xxx/install
wget http://sourceforge.net/projects/freetype/files/freetype2/2.5.0/freetype-2.5.0.1.tar.gz/download
tar -xzf freetype-2.5.0.1.tar.gz
cd freetype-2.5.0.1
./configure --without-png
make
make install
##install matplotlib(use 1.1.1 OK!! 1.3 fail...Orz)
#yum install python-matplotlib
wget http://downloads.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.1.1/matplotlib-1.1.1.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fmatplotlib%2Ffiles%2Fmatplotlib%2Fmatplotlib-1.1.1%2F&ts=1351382082&use_mirror=jaist
tar -xvf matplotlib-1.1.1.tar.gz
cd matplotlib-1.1.1
python setup.py build
python setup.py install
##install wave module
wget http://www.mega-nerd.com/libsndfile/files/libsndfile-1.0.25.tar.gz
tar -xzf libsndfile-1.0.25.tar.gz
cd libsndfile-1.0.25
./configure
make -j8
make -j8 install
wget --no-check-certificate https://pypi.python.org/packages/source/s/scikits.audiolab/scikits.audiolab-0.11.0.tar.gz#md5=f93f17211c7763d8631e0d10f37471b0
tar -xzf scikits.audiolab-0.11.0.tar.gz
cd scikits.audiolab-0.11.0
python setup.py build
python setup.py install
centos 6.3 + apache + php + tomcat
Reason 1... the Java systems all run on tomcat... and centos complains that non-root can't bind 80....
and I don't want to run tomcat as root... so apache had to be added...
Reason 2... for postgresql administration, phpPgAdmin is still the most convenient...
and php lives under apache anyway
Reason 3... thanks to network-management issues it's simpler to run everything over http ws...
so some traffic goes internal proxy -> DMZ proxy -> external proxy
(meow~ it's not like I wanted it this way)
Reason 4... apache (httpd) installs so fast ahhh~~~
Reason 5... I'm already working hard to trim the excess features and config to maintain...
(if the next maintainer can't pick it up... I'm still the one who gets burned ~_~)
Anyway~ the base stack is apache+php+tomcat....
depending on host and role~ they get paired off two at a time, along with the os pairing (centos, win server)...
when even installation has to be customized, it's deadly
(which is why this year I abandoned the ubuntu camp I knew better... mainly because centos's desktop looked far better than ubuntu's...)
Anyway, below is the php install (using phpPgAdmin)... centos and win differ quite a bit~
su
##one line installs apache, thumbs up
yum -y install httpd
#change apache listen port
vim /etc/httpd/conf/httpd.conf
Listen 80 ( other port--> Listen *:9080)
/etc/rc.d/init.d/httpd start
###(if permission denied, check this)
getenforce
-->Enforcing
setenforce 0
getenforce
-->Permissive
service httpd start
#check log to see httpd is OK?
ps -ef | grep http
cat /var/log/httpd/error_log
##install PHP, also one line... nice!
yum -y install php
##php test page
cd /var/www/html/
vim index.php
<?php
phpinfo();
?>
##restart httpd
service httpd restart
##test php
xxxx/index.php
##install phpPgAdmin... lovely~
##though yum has failed to find the package before... no idea whether it was taken down right then or the network was blocked somewhere??
yum install phpPgAdmin
#check conf; the default allows only localhost, sensibly enough... typical use is to add more Allow lines as needed
cat /etc/httpd/conf.d/phpPgAdmin.conf
<Location /phpPgAdmin>
Order deny,allow
Deny from all
Allow from 127.0.0.1
Allow from ::1
# Allow from .example.com
</Location>
cat /etc/php.ini
#if you don't know where things got installed to~ search for the files
#check where? (test ok it is /usr/share/phpPgAdmin)
find / -name phpPgAdmin
exit;
### install tomcat; I personally prefer the tarball install, yum scatters the folder layout and it's hard to manage
### run as a normal user, the default 8080 is fine
mkdir tomcat
cd tomcat
cp ~/apache-tomcat-7.0.26.tar.gz .
tar -zxv -f apache-tomcat-7.0.26.tar.gz
cd apache-tomcat-7.0.26
bin/startup.sh
## open browser link xxxx:8080 to test
### more setting
vim bin/catalina.sh
add
JAVA_OPTS="-Xms128m -Xmx1024m"
### use 80(apache2 proxy, ajp)
su
vim /etc/httpd/conf/httpd.conf
##(G) add at the very bottom,
##don't forward the / root~ it will hijack php and the other apache paths~
##just map the specific paths you actually need
<VirtualHost *:80>
ServerName mybestapp.com
ServerAlias www.mybestapp.com
ProxyRequests Off
ProxyPreserveHost On
ErrorLog /etc/xxx/log/httpd.tomcat.error.log
CustomLog /etc/xxx/log/httpd.tomcat.log combined
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
ProxyPass /tomcat ajp://localhost:8009/
ProxyPassReverse /tomcat ajp://localhost:8009/
ProxyPass /mywebapp ajp://localhost:8009/mywebapp
ProxyPassReverse /mywebapp ajp://localhost:8009/mywebapp
</VirtualHost>
service httpd restart
Tuesday, October 22, 2013
centos 6.3 service auto start
A server has plenty of services that must start at boot~ otherwise comical things happen...
especially ssh!!... trips to the machine room hurt T__T...
After the OS install, ssh is already set up as a service~ but it is "not started" and "not enabled at boot"
so all it needs is starting plus enabling~~ plus one more prerequisite to highlight...
the firewall must be opened (my habit: right after installing centos~ you end up in the GUI anyway~ and the GUI firewall tool is simple and clear~ so do it while you're there)
Also, for boot startup~ APs are normally set to start at level 3 or 5; for the definitions of each level see
http://ithelp.ithome.com.tw/question/10078093
RunLevel (SysV init) boot modes come in 7 levels, 0 ~ 6, with the following meanings:
0 (halt): shutdown; set the RunLevel here and the system powers off as soon as booting completes.
1 (Single user mode): single user; normally entered only when the system needs maintenance.
2 (Multiuser without NFS): multi-user but without NFS networking; for multi-user work that needs no network.
3 (Full multiuser mode): multi-user text mode; no GUI, full networking; the standard mode for administrators.
4 (unused): not used; free for you to define.
5 (X11): multi-user graphical mode with full networking, for administrators who prefer the GUI.
6 (reboot): set the RunLevel here and the system reboots as soon as booting completes.
##firewall setting
##open 5432(postgresql), 80(http), 22(ssh)
##start ssh
service sshd restart
## assign service auto start
su
chkconfig --list
chkconfig --level 5 sshd on
chkconfig --level 5 postgresql-9.1 on
chkconfig --level 5 httpd on
centos 6.3 + postgresql 9.1
centos 6.3 actually bundles postgresql 8.4; if you don't want to upgrade, nothing needs doing...
but if you do upgrade, the old one must be removed first~
One more thing to note~ postgresql ties system accounts (the ones ssh uses) to the accounts inside the db; simply put~ use the same name for the system account and the db account. If the db connection/owner is not the superuser (postgres), watch out!! (it looks like this can also be changed in the config~ the localhost group at the top~ but during setup you'd still have to create a whole pile of schemas~ and wrangling that setting is rather a bother Orz...)
My habit is to use the basic centos user (the ordinary user set up during install) as the account; fewer problems that way
Also, after many tries~ although the internet says install/startup can apparently be done without root~ without root you hit permission problems nine times out of ten (of five or six installs~ only once did it somehow work without XD)... so in the end I just start it as root~ the db runs as a service anyway~ root is less trouble~ and access control is handled down at the db-owner level...
For accounts, I roughly install as root, edit config files as postgres, and run psql as postgres or the custom user depending on the situation. As for the auth method in the config~ trust really does ignore passwords entirely.. having tried it, md5 is the better choice...
When initdb fails~ it leaves some log files under data~ basically skim them and delete, because initdb demands a completely empty folder (with yum the path should be /var/lib/pgsql/9.1/data)
su
## remove postgresql inside(default is 8)
yum erase postgresql
## install postgresql 9.1
wget http://yum.postgresql.org/9.1/redhat/rhel-6-x86_64/pgdg-centos91-9.1-4.noarch.rpm
rpm -ivh pgdg-centos91-9.1-4.noarch.rpm
yum install postgresql91-server (yum install postgresql91-server.x86_64)
adduser postgres
passwd postgres (xxxpwdxxx)
adduser myowner
passwd myowner (xxxpwdxxx)
### use psql to change pwd
###\password postgres
###\q
## init postgres db data
su postgres
cd /var/lib/pgsql/9.1/data (check owner must be postgres and EMPTY folder)
###if group not permission use root!!!
service postgresql-9.1 initdb
## config properties
su postgres
cd /var/lib/pgsql/9.1/data
vim pg_hba.conf (ident -> md5, add access ip range, ident means use another ident server check user/pwd)
#Add(local host ip must md5 for phpPgAdmin)
host all all 192.168.xx.1/24 md5
vim postgresql.conf
#Update
listen_addresses='*'
## restart postgresql
/etc/init.d/postgresql-9.1 restart
##or use service
service postgresql-9.1 restart
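Once it's running, a quick end-to-end check that remote access works (listen_addresses plus the md5 line in pg_hba.conf) is a trivial JDBC probe; a sketch, assuming the postgresql JDBC jar is on the classpath and using the myowner account created above (replace the <db-host> placeholder):

// minimal connectivity probe
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class PgPing {
    public static void main(String[] args) throws Exception {
        Connection c = DriverManager.getConnection(
                "jdbc:postgresql://<db-host>:5432/postgres", "myowner", "xxxpwdxxx");
        ResultSet rs = c.createStatement().executeQuery("SELECT version()");
        rs.next();
        System.out.println(rs.getString(1)); // should report PostgreSQL 9.1.x
        c.close();
    }
}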
Tuesday, October 15, 2013
Wednesday, August 14, 2013
DataTurbine minimum system requirements
http://www.dataturbine.org/content/system-requirements
System Requirements
What Do I Need to Run DataTurbine
DataTurbine is designed to run on any device from an industry-grade server to a low powered smart phone. It is both scalable and portable. Once the minimum requirements are satisfied additional constraints may be imposed by the needs of the specific project.
The Minimum Requirements:
Java Runtime Environment
JRE 1.5+ is required for the server
Sources/Sinks may have different requirements
Network Capabilities
Example of Compatible Systems
Personal Computer
Desktop, Laptop, Netbook
Windows, Linux, Mac
32-bit, 64-bit
Server
Windows, Linux, Solaris, etc...
32-bit, 64-bit
Micro-computers
Gumstix Device
ARM devices
Cell phone
Android Device
DataTurbine reading notes 3: Sink, Real-Time
http://www.dataturbine.org/content/sink
http://www.dataturbine.org/content/real-time
---------------------------------------------------------------
DataTurbine Sink
Introduction
A DataTurbine Sink (also referred to as an 'off-ramp') is simply a program that takes data from a DataTurbine Server and utilizes it, for example bringing it up in Matlab or Real-time Data Viewer, or putting it into a relational database or file for permanent storage.
Just like a source, a sink runs independently from the server as a separate application and uses the network to communicate. It can run on the same machine as the server or on a machine across the world.
The Sink's Perspective
From the sink's point of view it no longer needs to know where the data came from or how it got there. It can query all the sources and channels to find out what is available or specify a single channel via its name and name of its source.
The data is heterogeneous and the sink could access any type of data seamlessly. It makes the decision on how to display and interpret the data via its data type (byte array, 32-bit float, 32-bit int, etc) as well as the MIME Type specified by the sink.
A sink can issue a request to pull data from the server in a timeframe. A sink could also subscribe to a specific set of channels getting data as it becomes available.
Example: For example a sink could get a listing of all the sources available on a server, pick only the temperature channels, perform some analysis and, based on the result, bring up the images for the corresponding channels at significant time indexes
Common Types of Sinks
Viewer: An application that can be used to access and interact with the streaming data
Ex: Real-time Data Viewer (RDV), Google Earth, etc...
Web Server: An application that serves the data as web content for public display
Ex: Graphs on a public web site
Analysis: Takes the data and performs some kind of manual or automated analysis
Ex: Matlab, R, ESPER, etc..
Export: Exports the data into a file or set of files for distribution or integration
Ex: CSV files, Excel, etc...
Storage: Permanent storage in a database or as a series of files.
Ex: Storage in a relational database
Other: Easy to code any kind of sink that utilizes the data
Practical Example (Continued):
Going back to the example used in the source. Imagine a simple meteorological tower that measures temperature and humidity on top of a hill. Nearby is a field station that is also measuring temperature. We put this data into DataTurbine on a laptop at the field station and now want to view it and make sure that it is placed in permanent storage.
Start a DataTurbine server on the laptop (rbnb.jar)
Start a source on the laptop reading data from the meteorological tower
Start a source on the laptop reading data from the field station
Start a sink to view the data as it is collected in real-time. In this case we will use Real-time Data Viewer (RDV)
Start a sink to put the data into permanent storage in a MySQL database.
Our laptop would now have five independent lightweight programs running (1 server, 2 sources, 2 sinks). We will probably keep the server, sources, and the permanent storage sink running at all times. But we will start and stop the viewer sink as we need it.
Now we have a very basic but complete deployment running. But we are not sharing the data and not really utilizing the power of a real-time system (Aside from viewing the data as it is collected). Fear not this will be discussed in further sections as we build on our example.
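(My note, not the original docs: to make the sink side concrete~ a minimal sketch of pulling the newest samples with the RBNB Java client API (com.rbnb.sapi) that ships in rbnb.jar; the server address, sink name and channel path are made-up placeholders, and the channel is assumed to carry float32 data.)

// Sketch of a tiny DataTurbine sink (assumes rbnb.jar on the classpath).
import com.rbnb.sapi.ChannelMap;
import com.rbnb.sapi.Sink;

public class TinySink {
    public static void main(String[] args) throws Exception {
        Sink sink = new Sink();
        sink.OpenRBNBConnection("localhost:3333", "TinySink"); // server address, client name
        ChannelMap req = new ChannelMap();
        req.Add("MetTower/temperature"); // hypothetical source/channel
        sink.Request(req, 0.0, 10.0, "newest"); // the most recent 10 seconds of data
        ChannelMap reply = sink.Fetch(5000); // wait up to 5 s for the reply
        for (int i = 0; i < reply.NumberOfChannels(); i++) {
            float[] vals = reply.GetDataAsFloat32(i);
            double[] times = reply.GetTimes(i);
            for (int j = 0; j < vals.length; j++) {
                System.out.println(reply.GetName(i) + " @ " + times[j] + " = " + vals[j]);
            }
        }
        sink.CloseRBNBConnection();
    }
}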
Power of Real-time
DataTurbine as a Real-time Data System
If you read through previous sections you can see some of the benefits of DataTurbine as a "black box" system, separating the sources from the sinks and handling heterogeneous data types in a unified system. However the primary reason to use DataTurbine is the ability to interact with data in real-time or near real-time.
DataTurbine is built around this constant and its limitations for historical data are a direct consequence of its strength and speed at working with streaming real-time data.
In addition to working with live data, DataTurbine can stream archived data as if it were live, re-utilizing common data viewers and infrastructure for post-test data analysis and review.
What is Real-Time Data
Real-time data refers to delivering data as soon as it is collected. There is no delay in the timeliness of the information provided. This is in contrast to an archival system that stores data until a later date.
DataTurbine can handle data sampled millions of times a second or as infrequently as once a century. In practice many uses are somewhere in between with data sampling every second, minute or hour.
As many remote sites can have drastic communication delays and do not require a strict time constraint, it would be more correct to refer to those systems as providing near real-time data but for the sake of simplicity they are often also grouped into the real-time category.
Also note that when we talk about real-time we are focusing on the availability of data not to be confused with real-time computing which focuses on guaranteed response within strict time constraints.
Benefits of Real-time Data
Interactive:
Failure: The most direct benefit of real-time data is the ability to respond to factors on the fly. If a sensor goes bad the system registers it immediately and it can be fixed (before potentially months of data are ruined).
Important Event: If an event of importance occurs a team can be dispatched immediately to gather additional samples and observe the occurrence first hand.
Sampling: With a real-time system it's possible to change sampling rates and activate and deactivate sensors based on the data they receive.
Example: If one sensor detects an important event perhaps the sensors in that region need to increase their sampling rate temporarily or a camera needs to be activated.
Analysis: There is a lot of analysis that can be performed on real-time data and in certain cases this is actually the more efficient route. Averages, correlations, and mathematical operations can be performed in real-time with ease. The derived data can be put back into DataTurbine and further utilized. The end result is that summary and analytic data is available on the fly giving an overview of the health of the system and the experiment.
Public Consumption: Real-time also gives added value to the data. Data can be published publicly as it is gathered. The same sensor network that is monitoring an ecosystem for scientific research can display the tides and temperature of the water, the wind speed and direction, even a video feed showing the view of the forest.
Portable: Streaming data is very portable. Adding destinations or applications is easy and transparent. Since data is contained as tuples (time, value, source) it is easy for any system to accept it, and it requires significantly less overhead than trying to read from a rigid structure such as a database. Once a streaming system is set up, raw data, automated analysis, and quality assurance and quality control are available to any application and destination the provider specifies the second they are available. Any additional analysis (which could take weeks or months) can then be amended later.
Funding Compliance: There is increasing pressure from funding agencies for data providers to publicly publish data in a timely manner. A real-time system can help satisfy that compliance.
Limitations of Real-Time Data
Not a Replacement: A real-time data system would ideally be an addition, not a replacement, for an archival system. It should add to a system but makes a poor replacement for operations that are best suited to an archive such as a relational database.
Data Quality: Data coming directly from sensors will have inherent imperfections which have to be cleaned away before consumption. Unlike an archival system, which often just provides the cleanest, most annotated data, a real-time system would ideally have multiple data levels of progressively cleaner data.
Automated Cleaning: Automated QA/QC can be performed on a real-time stream to identify obvious inconsistencies and potentially problematic parts of the data.
Levels of Assurance: Different applications require a different level of assurance. For example a local weather site could use nearly raw data, while an intricate carbon dioxide absorption experiment would utilize manually cleaned and validated data.
Different Paradigm: While traditional analysis would still work on archived data, utilizing the real-time aspect of data often requires a different approach than analysis on archived data.
DataTurbine Sink
Introduction
A DataTurbine Sink (also refereed to as an 'off-ramp') is simply a program that takes data from a DataTurbine Server and utilizes it, for example brings it up in Matlab or Real-time Data Viewer or puts it into a relational database or file for permanent storage.
Just like a source, a sink runs independently from the server as a separate application and uses the network to communicate. It can run on the same machine as the server or on a machine across the world.
The Sink's Perspective
From the sink's point of view it no longer needs to know where the data came from or how it got there. It can query all the sources and channels to find out what is available or specify a single channel via its name and name of its source.
The data is heterogeneous and the sink could access any type of data seamlessly. It makes the decision on how to display and interpret the data via its data type (byte array, 32-bit float, 32-bit int, etc) as well as the MIME Type specified by the sink.
A sink can issue a request to pull data from the server in a timeframe. A sink could also subscribe to a specific set of channels getting data as it becomes available.
Example: For example a sink could get a listing of all the sources available on a server pick only the temperature channels, perform some analysis and based on the result bring up the images for the corresponding channels at significant time indexes
Common Types of Sinks
Viewer: An application that can be used to access and interact with the streaming data
Ex: Real-time Data Viewer (RDV), Google Earth, etc...
Web Server: An application that serves the data as web content for public display
Ex: Graphs on a public web site
Analysis: Takes the data and performs some kind of manual or automated analysis
Ex: Mat lab, R, ESPER, etc..
Export: Exports the data into a file or set of files for distribution or integration
Ex: CSV files, Excel, etc...
Storage: Permanent storage in a database or as a series of files.
Ex: Storage in a relational database
Other: Easy to code any kind of sink that utilizes the data
Practical Example (Continued):
Going back to the example used in the source. Imagine a simple meteorological tower that measures temperature and humidity on top of a hill. Nearby is a field station that is also measuring temperature. We put this data into DataTurbine on a laptop at the field station and now want to view it and make sure that it is placed in permanent storage.
Start a DataTurbine server on the laptop (rbnb.jar)
Start a source on the laptop reading data from the meteorological tower
Start a source on the laptop reading data from the field station
Start a sink to view the data as it is collected in real-time. In this case we will use Real-time Data Viewer (RDV)
Start a sink to put the data into permanent storage in a MySQL database.
Our laptop would now have five independent lightweight programs running (1 server, 2 sources, 2 sinks). We will probably keep the server, sources, and the permanent storage sink running at all times. But we will start and stop the viewer sink as we need it.
Now we have a very basic but complete deployment running. But we are not sharing the data and not really utilizing the power of a real-time system (Aside from viewing the data as it is collected). Fear not this will be discussed in further sections as we build on our example.
Power of Real-time
DataTurbine as a Real-time Data System
If you read through previous sections you can see some of the benefits of DataTurbine as a "black box" system, separating the sources from the sinks and handling heterogeneous data types in a unified system. However the primary reason to use DataTurbine is the ability to interact with data in real-time or near real-time.
DataTurbine is built around this constant and its limitations for historical data are a direct consequence of its strength and speed at working with streaming real-time data.
In addition to working with live data, DataTurbine can stream archived as if it were live, re-utilizing common data viewers and infrastructure for post-test data analysis and review.
What is Real-Time Data
Real-time data refers to delivering data as soon as it is collected, with no delay in the timeliness of the information provided. This is in contrast to an archival system that stores data for access at a later date.
DataTurbine can handle data sampled millions of times a second, or as infrequently as once a century. In practice, many uses fall somewhere in between, with data sampled every second, minute, or hour.
Since many remote sites can have drastic communication delays and do not require a strict time constraint, it would be more correct to refer to those systems as providing near real-time data, but for the sake of simplicity they are often grouped into the real-time category as well.
Also note that when we talk about real-time we are focusing on the availability of data; this is not to be confused with real-time computing, which focuses on guaranteed response within strict time constraints.
Benefits of Real-time Data
Interactive:
Failure: The most direct benefit of real-time data is the ability to respond to factors on the fly. If a sensor goes bad, the system registers it immediately and the sensor can be fixed (before potentially months of data are ruined).
Important Event: If an event of importance occurs a team can be dispatched immediately to gather additional samples and observe the occurrence first hand.
Sampling: With a real-time system it's possible to change sampling rates and to activate and deactivate sensors based on the data being received.
Example: If one sensor detects an important event perhaps the sensors in that region need to increase their sampling rate temporarily or a camera needs to be activated.
Analysis: There is a lot of analysis that can be performed on real-time data, and in certain cases this is actually the more efficient route. Averages, correlations, and mathematical operations can be performed in real-time with ease. The derived data can be put back into DataTurbine and further utilized. The end result is that summary and analytic data are available on the fly, giving an overview of the health of the system and the experiment (a derived-data relay is sketched after this list).
Public Consumption: Real-time also gives added value to the data, which can be published publicly as it is gathered. The same sensor network that is monitoring an ecosystem for scientific research can display the tides and the temperature of the water, the wind speed and direction, even a video feed showing the view of the forest.
Portable: Streaming data is very portable. Adding destinations or applications is easy and transparent. Since data is contained in tuples (time, value, source), it is easy for any system to accept, and it requires significantly less overhead than trying to read from a rigid structure such as a database. Once a streaming system is set up, raw data, automated analysis, and quality assurance / quality control results are available to any application and destination the provider specifies the second they are available. Any additional analysis (which could take weeks or months) can then be added later.
Funding Compliance: There is increasing pressure from funding agencies for data providers to publicly publish data in a timely manner. A real-time system can help satisfy that compliance.
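To make the derived-data idea from the Analysis item concrete, here is a sketch of a relay that subscribes to a raw channel, keeps a running average, and flushes the result back into the server as a new source. All names are made up, and the SAPI calls are as I recall them from the javadoc:

import com.rbnb.sapi.ChannelMap;
import com.rbnb.sapi.Sink;
import com.rbnb.sapi.Source;

// Sketch: consume a raw channel, maintain a running average, and push the
// derived value back into DataTurbine as a new source/channel.
public class AveragingRelay {
    public static void main(String[] args) throws Exception {
        Sink in = new Sink();
        in.OpenRBNBConnection("localhost:3333", "AvgReader"); // placeholder address/names
        ChannelMap req = new ChannelMap();
        req.Add("MetTower/temperature"); // hypothetical raw channel
        in.Subscribe(req);

        Source out = new Source(10, "append", 1000); // cache/archive sizes in frames
        out.OpenRBNBConnection("localhost:3333", "Derived");

        double sum = 0.0;
        long count = 0;
        while (true) {
            ChannelMap frame = in.Fetch(10000);
            if (frame.NumberOfChannels() == 0) continue; // assumed timeout
            for (double v : frame.GetDataAsFloat64(0)) { sum += v; count++; }
            ChannelMap outMap = new ChannelMap();
            int idx = outMap.Add("avgTemperature");
            outMap.PutTimeAuto("timeofday"); // stamp with the current time
            outMap.PutDataAsFloat64(idx, new double[] { sum / count });
            out.Flush(outMap); // derived data is now available to every other sink
        }
    }
}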
Limitations of Real-Time Data
Not a Replacement: A real-time data system would ideally be an addition to, not a replacement for, an archival system. It should add to a system, but it makes a poor substitute for operations that are best suited to an archive such as a relational database.
Data Quality: Data coming directly from sensors has inherent imperfections which have to be cleaned away before consumption. Unlike an archival system, which often provides only the cleanest, most annotated data, a real-time system would ideally have multiple data levels of progressively cleaner data.
Automated Cleaning: Automated QA/QC can be performed on a real-time stream to identify obvious inconsistencies and potentially problematic parts of the data.
Levels of Assurance: Different applications require a different level of assurance. For example a local weather site could use nearly raw data, while an intricate carbon dioxide absorption experiment would utilize manually cleaned and validated data.
Different Paradigm: While traditional analysis still works on archived data, utilizing the real-time aspect of the data often requires a different approach than analysis of archived data.
DataTurbine Reading Notes 2: Server, Source
Sources
http://www.dataturbine.org/content/server
http://www.dataturbine.org/content/source
The official site has diagrams, which make it easier to follow~~
Roughly, when defining data you need to know:
Name: the name...
Target Server: well, you do have to know where the server is~ 囧a
Channel: there can be more than one~ in this project I mainly read data, so finding the right channel is what counts QQ
Cache Size: since I'm not building a server for other people to use, it doesn't matter much
---------------------------------------------------
DataTurbine Server
What is RBNB?
The DataTurbine server is contained in rbnb.jar. It is the core of DataTurbine and serves as the central point that applications (sources and sinks) interface with.
It is not a replacement for a database and is designed for speed. Because of this, although it is possible to store years of data in a DataTurbine server, for most applications the data is also archived in permanent storage in a database.
The acronym RBNB stands for Ring Buffered Network Bus, the technology inside the DataTurbine server. To data sources (applications that generate data), it acts as a middleware ring buffer which stores heterogeneous time-sequenced data. To data sinks (applications that read data), it acts as a consolidated repository of data. Key to RBNB scalability is that each source (ring buffer) and sink (network bus connection) acts independently of the others.
The DataTurbine Server
It can be thought of as a series of rotating disks (a ring buffer), with new data being added and old data removed when the archive becomes full.
Sources (applications that add data to the server) each specify their own archive and cache sizes.
The archive size specified by a source determines the size of its ring buffer, i.e. how much data is buffered before it is discarded. DataTurbine can use as much storage as the system's physical drives allow. A good value depends on the storage space of the device the server is running on and on the needs of the project.
The cache size specified by a source determines how much of its ring buffer is held in memory (RAM). This again depends on the system the server is running on and on the applications. A cache can increase speed, but a bigger cache does not necessarily mean a faster system. A sketch of setting both sizes from Java follows.
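For concreteness, here is a tiny sketch of how a source declares those two sizes through the Java SAPI. The Source(cache, mode, archive) constructor and the archive modes are as I remember them from the SAPI javadoc, and the address/name are placeholders:

import com.rbnb.sapi.Source;

// Cache and archive sizes are per-source and given in frames when the source
// is created; "append" reuses an existing server-side archive, "create" starts
// a new one, "none" keeps no disk archive at all.
public class CreateSource {
    public static void main(String[] args) throws Exception {
        Source src = new Source(100, "append", 10000); // 100 frames in RAM, 10000 on disk
        src.OpenRBNBConnection("localhost:3333", "MetTower"); // placeholder address/name
        src.Detach(); // leave the ring buffer on the server instead of deleting it
    }
}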
This approach allows applications to interact with data in near real-time. Sinks can read data as it is collected and display it online, in Matlab, or other applications. Sinks can also interact with the data and move it into permanent storage.
The server is agnostic to the data it receives and can accept heterogeneous data types including numerical, video, audio, text, or any other digital medium. It acts as a black box with sources adding in data and sinks reading the data out.
The server expects an accurate timestamp for every data point. One limitation of this is that data cannot be back-loaded into the server. That means data has to be entered sequentially, and so for a given source each data point has to have a timestamp that is greater than the previous timestamp on record.
What is a Frame
Cache and archive sizes are specified as a number of frames. Each time a source application flushes data, it adds one frame. A frame is a data structure of one or more channels, with one or more data objects per channel. Thus the size of a frame may range from small to large and may vary from frame to frame (a one-frame flush is sketched below).
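A rough sketch of the frame idea in Java: two channels are put into one ChannelMap and flushed together, which lands on the server as exactly one frame. Channel names and address are again placeholders, and the calls are from my memory of the SAPI:

import com.rbnb.sapi.ChannelMap;
import com.rbnb.sapi.Source;

// One Flush() = one frame. This frame carries two channels, one sample each.
public class FlushOneFrame {
    public static void main(String[] args) throws Exception {
        Source src = new Source(10, "none", 0); // RAM cache only, no disk archive
        src.OpenRBNBConnection("localhost:3333", "MetTower"); // placeholder address/name
        ChannelMap frame = new ChannelMap();
        int t = frame.Add("temperature");
        int h = frame.Add("humidity");
        frame.PutTimeAuto("timeofday"); // timestamp the samples as they are put
        frame.PutDataAsFloat64(t, new double[] { 21.5 });
        frame.PutDataAsFloat64(h, new double[] { 0.63 });
        src.Flush(frame); // adds exactly one frame to this source's ring buffer
        src.CloseRBNBConnection();
    }
}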
DataTurbine Source
Introduction
A DataTurbine Source (also referred to as an 'on-ramp') is a program that takes data from a target (for example a sensor or a file) and puts it into a DataTurbine server.
A source runs independently from the server as a separate application and uses the network to communicate. It can run on the same machine as the server or across the world.
Each source can contain multiple channels, each with its own data type. A source controls its own server-side memory and hard drive space allocation.
Anatomy of a Source
Name: Identifies the source
Target Server: The server the source sends data to
Cache Size: Each source specifies how many frames of data to buffer for itself in the server's memory (RAM).
Archive Size: Each source specifies how many frames of data to store on the server's hard drive.
Multiple Channels: Each channel is a data stream containing one type of data (for example numeric or video).
In turn, each channel consists of a:
Name: Identifies the specific channel
MIME Type: Media type the applications can use to make decisions about the data they are receiving. Each channel can only store one type of data.
Data: Series of data points consisting of a time and value
Practical Example:
For example, let us imagine a simple meteorological tower that measures temperature and humidity on top of a hill. Nearby is a field station that is also measuring temperature. We want to get this data into DataTurbine on a laptop at the field station. Let's go over what we would do.
Assuming we have custom sources for our instrumentation.
Start a DataTurbine Server on the laptop (rbnb.jar)
Start a source on the laptop targeting our server that reads data from the meteorological tower and puts it into DataTurbine. This source would contain two channels (temperature & humidity)
Start another source on the laptop that reads from the local field station and puts the data into the server. This source would contain a single channel (temperature).
Our laptop would now have three independent lightweight programs running. And now that we have the data in the server we now need a way to access it. This is discussed in the next section.
PlugIns
PlugIns are a specialized on-request type of data source. Whereas regular sources proactively push data to the DataTurbine server, plugins reply with data in response to sink requests forwarded to them via their plugin server connection (sketched below).
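I have not used plugins myself, so the following is only a heavily-hedged sketch of the request/reply shape based on the PlugIn and PlugInChannelMap classes in com.rbnb.sapi; treat every call here as an assumption to verify against the javadoc:

import com.rbnb.sapi.ChannelMap;
import com.rbnb.sapi.PlugIn;
import com.rbnb.sapi.PlugInChannelMap;

// Sketch of a plugin: block until a sink request arrives, answer it on demand.
public class AnswerPlugIn {
    public static void main(String[] args) throws Exception {
        PlugIn plugin = new PlugIn();
        plugin.OpenRBNBConnection("localhost:3333", "AnswerPlugIn"); // placeholders
        ChannelMap reg = new ChannelMap();
        reg.Add("answer");
        plugin.Register(reg); // advertise the channel so sinks can discover it
        while (true) {
            PlugInChannelMap request = plugin.Fetch(60000); // wait for a request
            if (request.NumberOfChannels() == 0) continue; // assumed timeout
            request.PutTimeAuto("timeofday");
            request.PutDataAsFloat64(0, new double[] { 42.0 }); // computed on demand
            plugin.Flush(request); // reply goes back to the requesting sink
        }
    }
}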
Things to Keep in Mind
Each channel can only have one data type associated with it. Also remember that data cannot be back-loaded into the server: for each channel, data has to be entered sequentially, and each data point must carry a timestamp greater than the previous timestamp on record for that channel (see the sketch below).
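A sketch of what "sequential only" means in practice: each Flush() carries a timestamp later than the previous one, set explicitly via PutTime (placeholder names again, signatures from my memory of the SAPI):

import com.rbnb.sapi.ChannelMap;
import com.rbnb.sapi.Source;

// Timestamps on a channel must increase monotonically; flushing an older
// timestamp than the last one on record is rejected by the server.
public class SequentialTimestamps {
    public static void main(String[] args) throws Exception {
        Source src = new Source(10, "none", 0);
        src.OpenRBNBConnection("localhost:3333", "FieldStation"); // placeholders
        ChannelMap cm = new ChannelMap();
        int ch = cm.Add("temperature");
        double start = System.currentTimeMillis() / 1000.0; // DT times are in seconds
        for (int i = 0; i < 5; i++) {
            cm.PutTime(start + i, 0.0); // each point strictly later than the last
            cm.PutDataAsFloat64(ch, new double[] { 20.0 + i });
            src.Flush(cm); // five flushes = five one-sample frames
        }
        src.CloseRBNBConnection();
    }
}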
DataTurbine Reading Notes 1
Based on the official site
http://www.dataturbine.org/content/documentation
and the mighty Google Translate... this one has no wiki.. QQ
Below I pick out a few items I personally consider the key points and take notes on them
DataTurbine is real-time data streaming; its role is positioned at the middleware layer. It is built mainly with Java technology, and it is an open source project~~
The project itself includes a Server part and a UI part...
The Server handles the data streaming: delivery/relay
For the UI there is a separate package, rdv... an executable jar...
---------------------------------------
Introduction
DataTurbine is a robust real-time streaming data engine that lets you quickly stream live data from experiments, labs, web cams and even Java enabled cell phones. It acts as a "black box" to which applications and devices send and receive data. Think of it as express delivery for your data, be it numbers, video, sound or text.
DataTurbine is buffered middleware, not simply a publish/subscribe system. It can receive data from various sources (experiments, web cams, etc.) and send data to various sinks. It has TiVo-like functionality that lets applications pause and rewind live streaming data.
DataTurbine is open source and free. There is also an active developer and user community that continues to evolve the software and assist in application development. This guide is designed as a first step to learning and deploying DataTurbine.
Why Use Data Turbine
Extendable: It is a free open source project with an extensive, well documented API.
Scalable: It uses a hierarchical design that allows a network structure to grow with the requirements of your application.
Portable: DataTurbine runs on devices ranging from phones and buoys to multicore servers.
Dependable: Using a Ring Buffered Network Bus, it provides tunable persistent storage at key network nodes to facilitate reliable data transport.
Community: There is an active developer and user community that continues to evolve the software and assist in application development.
Understanding DataTurbine
The Goal
Let’s say you have some data you’re collecting. Could be, say, weather data. Could be load readings from a bridge, pictures from a security camera, GPS-tagged biometrics from a tracked tiger, chlorophyll readings from a lake buoy, or pretty much anything else you can think of. Add in data from another system, and now stir in the requirement of multiple viewers. In other words, you have a system with lots of disparate data that you want to see, share and process.
DataTurbine is an excellent solution. It’s probably even answering needs that you didn’t know you had! Briefly, DataTurbine lets you stream data and see it in real-time. But it also lets you TiVo through old and new data, share it with anyone over the network, do real-time processing of the streams and more.
A Free Open Source Solution
In 2007, DataTurbine was transitioned from commercial to open source under the Apache 2.0 license. All code and documentation are public and available from the project web site. Current DataTurbine related research includes projects sponsored by NSF, NASA, and the Gordon and Betty Moore Foundation.
What DataTurbine Does Best
Reliable Data Transfer
Real-Time Data
Streaming
Analysis
Visualization
Publication
Cleanly works with heterogeneous data types
Separates data acquisition (sources) from data utilization (sinks)
Seamlessly access historical and real time data
Synchronized access across disparate data channels
What DataTurbine Is Not Good At
Replacing a database (DataTurbine should be used with a database)
Out-of-order data (data is accepted chronologically only)
Back-loading data
The Parts
DataTurbine consists of one or more servers accepting data from sources and serving it up to sinks. Each component can be located on the same machine or on different machines, allowing for flexibility in deployment.