AdoptOS

Assistance with Open Source adoption

Open Source News

Overview: Talend Server Applications with Docker

Talend - Tue, 09/25/2018 - 08:53
Talend & Docker

Since the release of Talend 7, a major update to our software, users have been able to build a complete integration flow in a CI/CD pipeline that produces Docker images. For more on this feature, I invite you to read the blog written by Thibault Gourdel on Going serverless with Talend through CI/CD and Containers.

Another major update is the support of Docker for server applications like Talend Administration Center (TAC). In this blog, I want to walk you through how to build these images. Remember, if you want to follow along, you can download your free 30-day trial of Talend Cloud here. Let’s get started!

Getting Started: Talend Installation Modes

Talend provides two different installation modes when working with the subscription version. Once you have received download access to the Talend applications, you have a choice:

  • Installation using the Talend Installer: The installer packages all applications and offers an installation wizard to help you through the installation.
  • Manual installation: Each application is available in a separate package. It requires a deeper knowledge of Talend installation, but it provides a lighter way to install, which is especially useful for containers.

Both are valid choices depending on your use case and architecture. For this blog, let’s go with manual installation because it lets us define one image per application. The container layers stay lighter, and we avoid overloading them with unnecessary weight. For more information on Talend installation modes, I recommend you look at the Talend documentation Talend Data Fabric Installation Guide for Linux (7.0) and also Architecture of the Talend products.

Container Images: Custom or Generic?

Now that we know a bit more about Talend installation, we can start thinking about how to build our container images. There are two directions you can take when containerizing an application like Talend: a custom image or a generic image.

  • A custom image embeds part or all of the configuration in the build process. This means that when we run the container, it requires fewer parameters than a generic image. How much configuration is needed depends on the level of customization.
  • A generic image does not include any specific configuration; it corresponds to a basic installation of the application. The configuration is loaded at runtime.

To illustrate this, let’s look at an example with Talend Administration Center. Talend Administration Center is a central application in charge of managing users, projects and scheduling. Here is how the two approaches compare for building an image of Talend Administration Center:

  • A custom image can include:
    • A specific JDBC driver (MySQL, Oracle, SQL Server)
    • Logging configuration: Tomcat logging
    • properties: Talend Administration Centre Configuration
    • properties: Clustering configuration
  • A generic image
    • No configuration
    • Driver and configuration files can be loaded with volumes

The benefits and drawbacks of each approach will depend on your configuration, but in general:

  • A custom image:
    • Requires less configuration
    • Low to zero external storage required
    • Bigger images: more space required for your registry
  • A generic image
    • Lighter images
    • Reusability
    • Configuration required to run.

Getting Ready to Deploy

Once we have our images and they are pushed to a registry, we need to deploy them. Of course, we can test them on a single server with a docker run command. But let’s face it, that is not a real-world use case. Today, if we want to deploy a containerized application on premises or in the cloud, Kubernetes has become the de facto orchestrator to use. To deploy on Kubernetes, we can go with standard YAML files or a Helm package. But to give a quick example and a way to test in a local environment, I recommend starting with a docker-compose configuration as in the following example:

 

version: '3.2'
services:
  mysql:
    image: mysql:5.7
    ports:
    - "3306:3306"
    environment:
      MYSQL_ROOT_PASSWORD: talend
      MYSQL_DATABASE: tac
      MYSQL_USER: talend
      MYSQL_PASSWORD: talend123
    volumes:
      - type: volume
        source: mysql-data
        target: /var/lib/mysql
  tac:
    image: mgainhao/tac:7.0.1
    ports:
    - "8080:8080"
    depends_on:
      - mysql
    volumes:
      - type: bind
        source: ./tac/config/configuration.properties
        target: /opt/tac/webapps/org.talend.administrator/WEB-INF/classes/configuration.properties
      - type: bind
        source: ./tac/lib/mysql-connector-java-5.1.46.jar
        target: /opt/tac/lib/mysql-connector-java-5.1.46.jar
volumes:
  mysql-data:

The first service, mysql, creates a database container with one schema, tac, and a dedicated user to access it. For more information about the official MySQL image, please refer to: https://hub.docker.com/_/mysql/

The second service is my Talend Administration Center image, aka TAC, a simplified version as it uses only the MySQL database. In this case, I have a generic image that is configured when you run the docker-compose stack. The JDBC driver, like the configuration file, is loaded through a volume.

In a future article, I’ll go into more detail on how to build and deploy a Talend stack on Kubernetes. For now, enjoy building with Talend and Docker!

 

 

The post Overview: Talend Server Applications with Docker appeared first on Talend Real-Time Open Source Data Integration Software.

Categories: ETL

Red alert, shields up - The work of the Joomla Security Team

Joomla! - Tue, 09/25/2018 - 04:00

A CMS-powered website has all the ingredients for an IT security nightmare: it is publicly accessible, it’s running on powerful machines with great connectivity and the underlying system is used countless times around the globe, making it an attractive target for attackers.
The Joomla Security Strike Team (JSST) is working hard to make sure that this nightmare doesn’t become reality for Joomla users!

Categories: CMS

Vtiger recognised as “One to Watch” and voted 4th best in cloud CRMs!

VTiger - Tue, 09/25/2018 - 01:26
It’s always a delight to receive great feedback from our customers. It’s our way of knowing that our efforts are being acknowledged and loved. Excitingly enough, Vtiger was recently recognised as “One to Watch” in the mid-size market category by CRM Magazine as part of their 2018 CRM Market Awards. Also, we take great pleasure […]
Categories: CRM

Data Scientists Never Stop Learning: Q&A Spotlight with Isabelle Nuage of Talend

Talend - Mon, 09/24/2018 - 14:21

Data science programs aren’t just for students anymore. Now, data scientists can turn to open online courses and other resources to boost their skill sets. We sat down with Isabelle Nuage, Director of Product Marketing, Big Data at Talend to get insight on what resources are out there:

Q: How would you characterize the differences between data science research processes and machine learning deployment processes?

Isabelle: In the grand scheme of things, Data Science is Science. Data Scientists do a lot of iterations, through trial and error, before finding the right model or algorithm that fits their needs, and they typically work on sample data. When IT needs to deploy machine learning at scale, they take the work from the data scientists and try to reproduce it at scale for the enterprise. Unfortunately, it doesn’t always work right away, because real-life data differs from sample data: it has inconsistencies, often missing values, and other data quality issues.

Q: Why is putting machine learning (ML) models into production hard?

Isabelle: Data Scientists work in a lab mode, meaning they are often operating like lone rangers. They take the time to explore data and try out various models, and sometimes it can take weeks or even months to deploy their data models into production. By that time, the models have already become obsolete for the business, forcing them to go back to the drawing board. Another challenge for Data Scientists is data governance; without it, data becomes a liability. A good example of this is clinical trial data, where sensitive patient information has to be masked so it is not accessible to everyone in the organization.

Q: What are the stumbling blocks?

Isabelle: There is a lack of collaboration between the Data Science team and IT, where each tends to speak its own language and has its own set of skills that the other might not understand. Data Science is often considered a pure technology discipline and not connected to business needs, although the asks are often tied to the need for fast decision making in order to innovate and outsmart the competition. Existing landscapes, such as enterprise warehouses, are not flexible enough to give Data Science teams access to all the historical and granular information, as some data is stored on tapes. IT is needed to create a Data Lake that stores all that historical data to train the models and adds the real-time data that enables real-time decisions.

Q: How are enterprises overcoming them?

Isabelle: Enterprises are creating cloud data lakes (better suited to big data volumes and processing) and leveraging new services and tools such as serverless processing to optimize the cost of machine learning on big data volumes. They are also creating centers of excellence to foster collaboration across teams, as well as hiring a Chief Data Officer (CDO) to elevate data science to a business discipline.

Q: What advice might you offer enterprises looking to streamline the ML deployment process?

Isabelle: Use tooling that automates manual tasks such as hand-coding and fosters collaboration between the Data Science and IT teams. Let the Data Science team explore and do their research, but let IT govern and deploy data so it’s no longer a liability for the organization. Doing this in a continuous iteration and delivery fashion will enable continuous smart decision making throughout the organization.

Q: What new programs for learning data science skills have caught your attention and in what ways do they build on traditional learning programs?

Isabelle: I’m most interested in new tools that democratize data science, provide a graphical, easy-to-use UI and suggest the best algorithms for the dataset, rather than going through a multitude of lengthy trials and errors. These tools make data science accessible to more people, like business analysts, so more people within the enterprise can benefit from the sophisticated advanced analytics for decision-making. These tools help people get a hands-on experience without needing a PhD.

Q: What are some of your favorite courses and certifications?

Isabelle: I’d say Coursera, as it offers online courses where people can learn at their own pace; it even offers some free data science and machine learning courses. Another great option is MIT eLearning, which also offers courses for Data Science and Big Data.

Check out Talend Big Data and Machine Learning Sandbox to get started.

 

The post Data Scientists Never Stop Learning: Q&A Spotlight with Isabelle Nuage of Talend appeared first on Talend Real-Time Open Source Data Integration Software.

Categories: ETL

Knowledge Sharing in Software Projects

Open Source Initiative - Mon, 09/24/2018 - 13:15

In a special guest post, Megan L. Endres, Professor of Management at Eastern Michigan University, provides a debrief of the data gathered from a recent survey on Knowledge Sharing promoted by the OSI.

Thank you!

We are extremely grateful to those who filled out the survey. We feel that our research can help create better environments at work, where team members can share knowledge and innovate.

Purpose of the Study
Our research is focused on knowledge sharing in ambiguous circumstances. Six Sigma is a method of quality control that should reduce ambiguity, given its structured approach. We ask whether the reduction in ambiguity is coupled with a reduction in knowledge sharing as well.

Who responded?

A total of 54 people responded, of whom 58% had a bachelor’s degree and 26% had a master’s degree. Average full-time work experience was 13.9 years, and average managerial experience was 6.7 years.

Most respondents (53%) reported working in an organization with 400+ full-time employees, although a strong second (37%) reported working with 100 or fewer.

Most reported that they work on a team of 3 members (21%), although a large percentage work on teams with 4 members (18.4%) and 5 members (13.2%). The complexity of the team tasks was moderately high, rated 3.66 on a 1 to 5 scale (least to most complex) (s.d. = 1.05).

Knowledge and Sharing

Respondents believed they brought considerable expertise to their team projects, which could be a result of good team assignments according to knowledge and skill. The average expertise reported was 4.13, on a scale of 1 (very low) to 5 (very high) (s.d. = 0.99).

The important variables we gathered are below with their means. Each is the average of a set of questions that was tested for reliability. It is important to note that the standard deviations are all about 1, which, given a 5-point scale, indicates general agreement among those who responded. The averages of these variables were the same across varied years of experience, years of management, size of company, and level of education.

Variable                              Mean
I share knowledge on my teams         4.35
My team shares knowledge with me      3.51
Knowledge sharing is valuable         4.43
My teams are innovative/creative      4.05
I have clear goals/feedback           3.17

Relationships in the Data

We will be gathering more data in order to perform more complex data analysis, but correlations show relationships that may prove to be important.

Significant relationships include:

  • Higher self-reported knowledge sharing is related to clearer goal setting at work, more innovative teamwork, and positive knowledge sharing attitudes. This is not surprising, because an environment with positive knowledge sharing has better communication between team members and, therefore, clarifications are more likely when goals aren’t clear. Those who worked for larger organizations (400+ employees) said that their goal setting was clearer. This also is not surprising, because more formal structure in the organization probably is associated with formal performance reviews and procedures.
  • Higher team knowledge sharing is associated with a lower likelihood of having a Six Sigma belt and with lower Six Sigma knowledge. This may indicate that knowledge sharing and Six Sigma are negatively related, but until a larger sample of responses is gathered, this is only a proposition.
  • The open source software questions have not revealed important information so far. That is because you are part of a sample that uniformly has positive attitudes toward open source (in general). Others who are not affiliated with open source groups will fill out the survey in the future, and variation in their responses will allow us to study relationships with other data.

Megan L. Endres, Professor of Management, Eastern Michigan University

Knowledge Sharing in Software Projects, by Megan L. Endres, CC-BY 4.0. 2018

Knowledge-sharing, by Ansonlobo. CC BY-SA 4.0. 2016. From Wikimedia Commons.

Categories: Open Source

Membership Renewal online class - Wednesday, September 26th

CiviCRM - Sun, 09/23/2018 - 16:51

Join Cividesk on Wednesday, September 26th at 9 am PT/ 10 am MT/ 11 am CT/ 12 pm ET for this informative 2 hour online session that will cover best practices for organizing membership renewals in CiviCRM.  This class is recommended as the next step in your CiviCRM training for those who have taken the "Fundamentals of Membership Management" session. 

Categories: CRM

The 2018 Strata Data Conference and the year of machine learning

SnapLogic - Fri, 09/21/2018 - 15:10

Recently, I represented SnapLogic at the September 2018 Strata Data Conference at the Javits Center in New York City. This annual event is a good indication of trends in data – big and otherwise. If 2017 was the year of Digital Transformation, 2018 is the year of machine learning. Many of the exhibitor’s booth themes[...] Read the full article here.

The post The 2018 Strata Data Conference and the year of machine learning appeared first on SnapLogic.

Categories: ETL

What’s In Store for the Future for Master Data Management (MDM)?

Talend - Fri, 09/21/2018 - 12:37

Master Data Management (MDM) has been around for a long time, and many people like myself, have been involved in MDM for many years. But, like all technologies, it must evolve to be successful. So what are those changes likely to be, and how will they happen?

In my view, MDM is changing, and will continue to change, in two important ways in the coming years. First, there will be technological changes, such as moving MDM into the cloud or toward ‘software as a service’ (SaaS) offerings, which will change the way MDM systems are built and operated. Second, there are and will be more fundamental changes within MDM itself, such as moving MDM from single-domain models to truly multi-domain models. Let’s look at these in more detail.

Waves of MDM Change: Technical and Operational

New and disruptive technologies fundamentally change the way we do most things, and in the area of MDM I expect change in two main areas. First comes the cloud. In all areas that matter in data we are seeing moves into the cloud, and I expect MDM to be no different. The reasons are simple and obvious: offering MDM as SaaS brings cost savings in build, support, operation, automation and maintenance, and is therefore hugely attractive to all businesses. Going forward, I expect to see MDM offered more and more as SaaS.

The second area where I see changes happening is more fundamental. Currently, many MDM systems concentrate on single-domain models. This is the way it has been for many years, and it currently manifests itself in the form of a ‘customer model’ or a ‘product model’. Over time I believe this will change. More and more businesses are looking toward multi-domain models that will, for example, allow models to be built that capture the links between customers and partners, products, suppliers, etc. This is the future for MDM models, and already at Talend our multi-domain MDM tool allows you to build models of any domain you choose. Going forward, it’s clear that linking those multi-domain models together will be the key.

Watch The 4 Steps to Become a Master at Master Data Management now.

MDM and Data Matching

Another change on the way concerns how MDM systems do matching. Currently, most systems do some type of probabilistic matching on properties within objects. I believe the future will see more MDM systems doing ‘referential matching’. By this, I mean making more use of reference databases, which may contain datasets like demographic data, in order to do better data matching. Today, many businesses use data that is not updated often enough and so becomes of less and less value. Using external databases to, say, get the updated address of your customer or supplier should dramatically improve the value of your matching.
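
To make the idea concrete, here is a minimal, hypothetical Java sketch of referential matching. The class names and the in-memory reference map are illustrative stand-ins for a real reference database or service, not Talend APIs:

import java.util.Map;
import java.util.Objects;

// Minimal sketch: instead of fuzzy-scoring two raw records against each other,
// both are first resolved against a trusted reference dataset (for example an
// up-to-date postal address database) and then compared on the canonical values.
public class ReferentialMatchSketch {

    static class CustomerRecord {
        final String name;
        final String rawAddress;

        CustomerRecord(String name, String rawAddress) {
            this.name = name;
            this.rawAddress = rawAddress;
        }
    }

    // Stand-in for an external reference service keyed by the raw address string.
    private final Map<String, String> canonicalAddresses;

    ReferentialMatchSketch(Map<String, String> canonicalAddresses) {
        this.canonicalAddresses = canonicalAddresses;
    }

    boolean sameCustomer(CustomerRecord a, CustomerRecord b) {
        String canonicalA = canonicalAddresses.getOrDefault(a.rawAddress, a.rawAddress);
        String canonicalB = canonicalAddresses.getOrDefault(b.rawAddress, b.rawAddress);

        // With canonical values, a simple equality check can replace a probabilistic score.
        return a.name.equalsIgnoreCase(b.name) && Objects.equals(canonicalA, canonicalB);
    }

    public static void main(String[] args) {
        Map<String, String> reference = Map.of(
            "10 Main St.", "10 Main Street, Springfield",
            "10 Main Street", "10 Main Street, Springfield");

        ReferentialMatchSketch matcher = new ReferentialMatchSketch(reference);

        System.out.println(matcher.sameCustomer(
            new CustomerRecord("Jane Doe", "10 Main St."),
            new CustomerRecord("jane doe", "10 Main Street"))); // prints: true
    }
}

A real MDM implementation would call out to a maintained reference service rather than an in-memory map, but the principle is the same: match on trusted canonical values rather than on the raw ones.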

Machine Learning to the Rescue

The final big area of change coming for MDM is the introduction of intelligence, or machine learning. In particular, I forecast we will see intelligence in the form of machine learning survivorship. This will likely take the form of algorithms that ‘learn’ how records survive and then use these results to predict which records survive and which don’t. This will free up a lot of time for the data steward.

Conclusion

Additional changes will likely also come around the matching of non-western names and details (such as addresses). At the moment these can be notoriously tricky because, for example, algorithms such as Soundex simply can’t be applied to many languages. This will change, and we should see support for more and more languages.

One thing I am certain of, though: many of the areas I mentioned are being worked on, all vendors will likely make changes in these areas, and Talend will always be at the forefront of development in the future of Master Data Management. Do you have any predictions for the future of MDM? Let me know in the comments below.

Learn more about MDM with Talend’s Introduction to Talend’s Master Data Management tutorial series, and start putting its power to use today!

 

The post What’s In Store for the Future for Master Data Management (MDM)? appeared first on Talend Real-Time Open Source Data Integration Software.

Categories: ETL

Great CiviCRM meetup in Antwerp!

CiviCRM - Fri, 09/21/2018 - 07:45

On Tuesday 11 September 2018 we had quite an enjoyable and interesting CiviCRM meetup in Antwerp.

     

A total of 32 participants gathered at the University of Antwerp Middelheim campus to share their stories around CiviCRM.

Categories: CRM

Introducing Diet Civi

CiviCRM - Fri, 09/21/2018 - 07:21

Diet Civi is a new "working" group within the CiviCRM community.

 

Our main objectives are to

  • improve CiviCRM’s ability to support a variety of different workflows on the common core data model.

  • define, coordinate, foster, and fund projects to achieve this.

 

Categories: CRM

API World recap and self-service integration with AI

SnapLogic - Thu, 09/20/2018 - 16:57

It’s been about a week since I attended API World in San Jose, California to present an hour-long keynote, “Supercharging Self-Service API Integration with AI,” and I wanted to share some of my takeaways from this great conference. For those who were unable to attend, I shared SnapLogic’s journey with machine learning (ML), going from[...] Read the full article here.

The post API World recap and self-service integration with AI appeared first on SnapLogic.

Categories: ETL

Why Organizations Are Choosing Talend vs Informatica

Talend - Thu, 09/20/2018 - 15:10

Data has the power to transform businesses in every industry from finance to retail to healthcare. 2.5 quintillion bytes of data are created every day, and the volume of data is doubling each year. The low cost of sensors, ubiquitous networking, cheap processing in the cloud, and dynamic computing resources are not only increasing the volume of data, but the enterprise imperative to do something with it. Plus, not only is there more data than ever, but also more people than ever who want to work with it to create business value.

Businesses that win today are using data to set themselves apart from the competition, transform customer experience, and operate more efficiently. But with so much data, and so many people who want to use it, extracting business value out of it is nearly impossible — and certainly not scalable — without data management software. But what is the right software to choose? And what criteria should data and IT professionals be looking for when selecting data integration software?

In a recent survey undertaken with TechValidate of 299 Talend users, respondents expressed clear preferences for data integration tools that had the following characteristics:

• Flexibility. Respondents wanted data integration tools to be able to connect to any number of data sources, wherever they happen to be located (in the cloud, on-premises, in a hybrid infrastructure, etc.)
• Portability. Respondents wanted the ability to switch deployment environments with the touch of a button.
• Ease of Use. Respondents wanted their data integration tools to be easy to use, with an intuitive interface.

Large majorities of survey respondents who selected Talend vs Informatica made their choice based on those factors.

Talend vs Informatica: A Common Choice

Data and IT professionals have numerous choices when deciding how to manage their enterprise data for business intelligence and analytics. Among our customers surveyed, we found the most common choice was between Talend and Informatica, followed by Talend vs hand coding.

The reasons to use a tool over traditional hand-coded approaches to data processes like ETL are numerous; interestingly, survey respondents that chose Talend over hand-coding see productivity gains of 2x or more.

 

[Embedded TechValidate survey chart]

They also find that their maintenance costs are reduced when they use Talend over hand coding. Clearly, choosing a data management tool like Talend over hand-coding data integrations is the right choice. But when organizations are trying to decide between tools, what factors are they considering?

Talend vs Informatica: Talend is More Flexible and Easier to Use

As we’ve seen, customers that chose Talend over Informatica cited Talend’s flexibility, portability, and ease of use as differentiating factors. These factors were particularly important to customers who chose Talend vs Informatica. In fact, 95% of these customers said that Talend’s flexibility and open source architecture distinguished it from the competition. In addition, 90% of them cited portability as a competitive differentiator, and 85% of them noted that ease of use distinguished Talend from the competition as well.

Given the increased impact that cloud data integration is having on the data management landscape, these factors make sense. The increasing amount of data in a wide variety of environments must be processed and analyzed efficiently; in addition, there is an enterprise necessity to be able to change cloud providers and servers as easily as possible. Therefore, flexibility and portability gain greater importance. Plus, with the greater number of people wanting access to data, having tools that are easy to use becomes very important in order to provide access to all the lines of business who want and need data for their analytics operations.

Talend: A Great Choice for Cloud Data Integration

Customers who are using Talend find its open-source architecture and collaborative tools useful for a number of business objectives, including using data to improve business efficiency and improving data governance.

 

[Embedded TechValidate survey chart]

Talend has proved extremely useful in helping organizations get true value out of their data. One customer noted:

 

[Embedded TechValidate customer quote]

If you’re considering using data management software, why not try Talend free for 30 days and see what results you can achieve for your business? Data can be truly transformative. Harness it with an open-source, scalable, easy-to-manage tool.

The post Why Organizations Are Choosing Talend vs Informatica appeared first on Talend Real-Time Open Source Data Integration Software.

Categories: ETL

Lines, Splines, and Bars, Oh My!

Liferay - Thu, 09/20/2018 - 13:59

Charts are great visual aids. They're crucial to digesting data and information quickly. Prior to Liferay Portal 7.1, we didn't have a way to include charts in our apps. No more! Now you can use charts out-of-the-box, thanks to the Chart taglib. Whether it's bar charts, line charts, pie charts, scatter charts, spline charts, or even donut charts, the Chart taglib has you covered. I like a good old fashioned pie myself: Apple, Blueberry, Pumpkin, you name it. They're delicious, and so are these charts.

This blog post gives an overview of how charts work in Liferay and links to our official documentation, so you can dive deeper and go chart crazy!

How Does it Work?

Using the Chart taglib is pretty straightforward: you provide the data and then configure the chart to digest the data in the JSP. Data can be written in a Java class, a JSON file, an array of numbers, etc.

A Few Good Examples

Alright, so you have an idea of how charts work in Liferay, but what about an actual example? No problem. Let's start with the oh so delicious pie chart.

Pie Chart Example

Pie charts display percentage-based slices that represent data. This example sets up some sample data via a Java class, and then feeds that data into the Pie chart tag's config attribute.

Java Sample Data:

public class ChartDisplayContext {

    public PieChartConfig getPieChartConfig() {
        PieChartConfig _pieChartConfig = new PieChartConfig();

        _pieChartConfig.addColumns(
            new SingleValueColumn("data1", 30),
            new SingleValueColumn("data2", 70));

        return _pieChartConfig;
    }
}

JSP:

<chart:pie config="<%= chartDisplayContext.getPieChartConfig() %>" id="pie" />

The resulting Pie chart:

As you can see, configuring the chart is fairly easy. Now, let's take a look at a combination chart.

Combination Chart Example

Combination charts let you visualize multiple types of data in one chart. Simply specify the representation type of each data set. The example below also makes use of grouping. Data 1 and Data 2 are grouped together within the same bar.

Java sample data:

public class ChartDisplayContext {

    public CombinationChartConfig getCombinationChartConfig() {
        CombinationChartConfig _combinationChartConfig = new CombinationChartConfig();

        _combinationChartConfig.addColumns(
            new TypedMultiValueColumn("data1", Type.BAR, 30, 20, 50, 40, 60, 50),
            new TypedMultiValueColumn("data2", Type.BAR, 200, 130, 90, 240, 130, 220),
            new TypedMultiValueColumn("data3", Type.SPLINE, 300, 200, 160, 400, 250, 250),
            new TypedMultiValueColumn("data4", Type.LINE, 200, 130, 90, 240, 130, 220),
            new TypedMultiValueColumn("data5", Type.BAR, 130, 120, 150, 140, 160, 150),
            new TypedMultiValueColumn("data6", Type.AREA, 90, 70, 20, 50, 60, 120));

        _combinationChartConfig.addGroup("data1", "data2");

        return _combinationChartConfig;
    }
}

JSP:

<chart:combination config="<%= chartDisplayContext.getCombinationChartConfig() %>" id="combination" />

The resulting combination chart:

So far, we've looked at static charts. Let's see how we can update charts to reflect real-time data.

Real Time Data Example

Charts can reflect static or real-time data, such as data fed in from a JSON file that changes periodically. This is made possible via each chart's optional polling interval property, which specifies the time in milliseconds for the chart's data to refresh. To set the polling interval property, use the setPollingInterval() method.

Java sample data:

public class MyBarChartDisplayContext {

    public BarChartConfig getBarChartConfig() {
        BarChartConfig _barChartConfig = new BarChartConfig();

        _barChartConfig.addColumns(
            new MultiValueColumn("data1", 100, 20, 30),
            new MultiValueColumn("data2", 20, 70, 100));

        _barChartConfig.setPollingInterval(2000);

        return _barChartConfig;
    }
}

The real time data is simulated in the JSP via a promise that resolves when the chart component is loaded:

<chart:bar
    componentId="polling-interval-bar-chart"
    config="<%= myBarChartDisplayContext.getBarChartConfig() %>"
    id="polling-interval-bar-chart"
/>

<aui:script>
Liferay.componentReady('polling-interval-bar-chart').then(
    function(chart) {
        chart.data = function() {
            return Promise.resolve(
                [
                    {
                        data: [Math.random() * 100, Math.random() * 100, Math.random() * 100],
                        id: 'data1'
                    },
                    {
                        data: [Math.random() * 100, Math.random() * 100, Math.random() * 100],
                        id: 'data2'
                    }
                ]
            );
        };
    }
);
</aui:script>

The resulting real time bar chart (looped for effect):

Geomap Chart Example

A Geomap Chart lets you visualize data based on geography, given a specified color range, usually with a lighter color representing a lower rank and a darker color a higher rank. This example ranks each location by the length of its name, using the name_len value specified in the geomap’s JSON file and applied with the line geomapColor.setValue("name_len");. The setValue() method defines which JSON property is applied to the geomap, and the JSON file path is specified with the setDataHREF() method. The example below uses custom colors.

Java sample data:

public class ChartDisplayContext {

    public GeomapConfig getGeomapConfig() {
        GeomapConfig _geomapConfig = new GeomapConfig();

        GeomapColor geomapColor = new GeomapColor();
        GeomapColorRange geomapColorRange = new GeomapColorRange();

        geomapColorRange.setMax("#b2150a");
        geomapColorRange.setMin("#ee3e32");

        geomapColor.setGeomapColorRange(geomapColorRange);
        geomapColor.setSelected("#a9615c");
        geomapColor.setValue("name_len");

        _geomapConfig.setColor(geomapColor);

        String href = "https://mydomain.com/myservice/geomap.geo.json";

        _geomapConfig.setDataHREF(href);

        return _geomapConfig;
    }
}

The JSP not only points to the data, but also includes styling for the geomap SVG in this case:

<style type="text/css">
    .geomap {
        margin: 10px 0 10px 0;
    }

    .geomap svg {
        width: 100%;
        height: 500px !important;
    }
</style>

<chart:geomap
    config="<%= chartDisplayContext.getGeomapConfig() %>"
    id="geomap-custom-colors"
/>

Resulting geomap:

No longer does that burning question have to keep you up at night: 
Where in the World is Carmen Sandiego? 

Thanks to Liferay's charts, now we can use real-time data to track her whereabouts on our geomap:

This blog post gave a brief overview of how to use charts in Liferay, using simplified code examples. Check out our official documentation on dev.liferay.com for complete examples and information. Thanks for reading!

 

Michael Williams 2018-09-20T18:59:00Z
Categories: CMS, ECM

California’s First Open Source Election System: Maybe not!

Open Source Initiative - Thu, 09/20/2018 - 09:28

OSI Affiliate Member, the California Association of Voting Officials (CAVO), has expressed concerns that recent announcements by the Los Angeles County Registrar-Recorder/County Clerk (Dean Logan) and the State of California's Secretary of State (Alex Padilla) were not accurate in describing a newly certified election tally system, "Voting System For All People" (VSAP), as using "open source technology."

Both the Los Angeles County and California Secretary of State announcements stated the elections system was, "the first publicly-owned, open-source election tally system certified under the California voting systems standards" [emphasis added].

Initially, the OSI expressed praise for the announcements from California,

@CountyofLA's vote tally system is California’s first certified #elections system to use #opensource technology. This publicly-owned technology represents a significant step in the future of elections in California and across the country. https://t.co/GZ3aWZgu83 pic.twitter.com/Vn66CtplgP

— OpenSourceInitiative (@OpenSourceOrg) August 27, 2018

The announcement appeared to be the culmination of several years of work by LA County in developing an open source voting system. Yet almost immediately after the news broke of the open source election tally system, doubts were raised. StateScoop reported, "Los Angeles County's new 'open source' vote tallying system isn't open source just yet." The StateScoop article included a comment by John Sebes, chief technology officer of the Open Source Election Technology Institute: "My takeaway is that their intention is to make it freely available to other organizations, but today it's not. It's open source in the sense that it was paid for by public funds and the intent is to share it." In a comment to the OSI, Tim Mayer, President of CAVO, offered, "Los Angeles County must share their code publicly now. They have a history of not collaborating with the open source voting pioneers and community members. In order for it to be open source they must meet the standards."

Chris Jerdonek, San Francisco Elections Commissioner and Chair of San Francisco's Open Source Voting System Technical Advisory Committee, requested a copy of the source code for VSAP. In response, while LA County, "determined that there are responsive records to [Jerdonek's] request," the county stated that the records are exempt from disclosure as the records:

  • are, "prohibited from disclosure by federal or state-law",
  • relate to, "information technology systems of a public agency", and
  • "the facts of the particular case dictate that the public interest served by not disclosing the record clearly outweighs the public interest served by disclosure of the record."

All three of these responses conflict with global expectations of software described as open source, and contradict the specific benefits (i.e. "Linus's Law") extolled by Padilla and Logan for developing an open source elections system...

“With security on the minds of elections officials and the public, open-source technology has the potential to further modernize election administration, security, and transparency.”
- Secretary of State, Alex Padilla.

We observed what took place in the last decade with this heightened awareness and sensitivity to voting technology at the same time as this kind of evolution of open-source.
- LA County Registrar-Recorder/County Clerk, Dean Logan

“Open source software” is a defined term, that is, software distributed with an OSI-Approved Open Source License. Each of these licenses is certified against the Open Source Definition. The OSI's License Review Process guarantees software freedom through approved licenses, providing "permission in advance" to study, use, modify and redistribute the software.

For the Open Source Initiative, our concerns revolve around the apparent lack of regard for the open source label by county and state officials―its affordances and value―although perhaps the current state of the project is simply due to a lack of experience with, or in, open source communities of practice. Authenticity in principles and practice is of the utmost importance to the OSI in our efforts to promote and protect open source software, development and communities. Misuse (innocent or nefarious) dilutes the value, weakens trust, confuses the public, and reduces the efficacy of open source licensed software.

Both CAVO and the OSI have requested from the Los Angeles County Registrar-Recorder/County Clerk the open source software code and the OSI-Approved Open Source License distributed with the related project certification. CAVO has also requested a web link to a demonstration site and other surrounding information. As of today, neither organization has received a response from LA County, although the OSI has been assured a reply is forthcoming.

"We want to assure the open source community that Los Angeles' representations are being addressed with appropriate scrutiny," stated CAVO Secretary Brent Turner. "We will not allow 'open-washing' to interfere with our efforts toward the national security."

Although concerned at this point with the communications around the "first publicly owned, open source election tally system certified under the California voting systems standards," we at the OSI are extremely enthusiastic that there is apparently interest, and there are efforts underway, to deliver open source voting systems. We are hopeful that these initial shortcomings are simply gaps in process and practice inherent to bureaucracies and operations as they evolve, adopt new technologies, and update policies.

The OSI and CAVO stand ready, and offer our support and expertise to Los Angeles County and the State of California to help develop, deploy and build community around their elections software.

Categories: Open Source

The data journey: From the data warehouse to data marts to data lakes

SnapLogic - Wed, 09/19/2018 - 15:59

With data increasingly recognized as the corporate currency of the digital age, new questions are being raised as to how that data should be collected, managed, and leveraged as part of an overall enterprise data architecture. Data warehouses: Model of choice For the last few decades, data warehouses have been the model of choice, used[...] Read the full article here.

The post The data journey: From the data warehouse to data marts to data lakes appeared first on SnapLogic.

Categories: ETL

Meet Martin, Ambassador of the month | September 2018

PrestaShop - Wed, 09/19/2018 - 09:27
Martin is our PrestaShop Ambassador in Brussels, Belgium, and he is the lucky Ambassador of the month! Congratulations to him!
Categories: E-commerce

The Big Data Debate: Batch vs. Streaming Processing

Talend - Tue, 09/18/2018 - 12:48

While data is the new currency in today’s digital economy, it’s still a struggle to keep pace with the changes in enterprise data and the growing business demands for information. That’s why companies are liberating data from legacy infrastructures by moving over to the cloud to scale data-driven decision making. This ensures that their precious resource— data — is governed, trusted, managed and accessible.

While businesses can agree that cloud-based technologies are key to ensuring the data management, security, privacy and process compliance across enterprises, there’s still an interesting debate on how to get data processed faster — batch vs. stream processing.

Each approach has its pros and cons, but your choice of batch or streaming all comes down to your business use case. Let’s dive deep into the debate to see exactly which use cases require the use of batch vs. streaming processing.

Batch vs. Stream Processing: What’s the Difference?

A batch is a collection of data points that have been grouped together within a specific time interval. Another term often used for this is a window of data. Streaming processing deals with continuous data and is key to turning big data into fast data. Both models are valuable, and each can be used to address different use cases. And to make it even more confusing, you can do windows of batch in streaming, often referred to as micro-batches.

While the batch processing model requires a set of data collected over time, streaming processing requires data to be fed into an analytics tool, often in micro-batches, and in real time. Batch processing is often used when dealing with large volumes of data or data sources from legacy systems, where it’s not feasible to deliver data in streams. Batch data also, by definition, requires all the data needed for the batch to be loaded into some type of storage, such as a database or file system, before it can be processed. At times, IT teams may be idly sitting around and waiting for all the data to be loaded before starting the analysis phase.

Data streams can also be involved in processing large quantities of data, but batch works best when you don’t need real-time analytics. Because streaming processing is in charge of processing data in motion and providing analytics results quickly, it generates near-instant results using platforms like Apache Spark and Apache Beam. For example, Talend’s recently announced Talend Data Streams is a free Amazon Marketplace application, powered by Apache Beam, that simplifies and accelerates ingestion of massive volumes and wide varieties of real-time data.
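
To make the micro-batch idea concrete, here is a minimal sketch using Apache Spark's Structured Streaming Java API. It reads from Spark's built-in rate source (a stand-in for a real stream such as sensor data) and processes the unbounded stream in five-second micro-batches; this is a generic Spark illustration, not a Talend Data Streams example:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;
import org.apache.spark.sql.streaming.Trigger;

public class MicroBatchSketch {

    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
            .appName("micro-batch-sketch")
            .master("local[*]")
            .getOrCreate();

        // Continuous source: the built-in "rate" source emits rows non-stop,
        // much like an always-on sensor feed.
        Dataset<Row> events = spark.readStream()
            .format("rate")
            .option("rowsPerSecond", "10")
            .load();

        // The stream has no start or stop; the trigger slices it into
        // micro-batches that are processed every five seconds.
        StreamingQuery query = events.writeStream()
            .format("console")
            .trigger(Trigger.ProcessingTime("5 seconds"))
            .start();

        query.awaitTermination();
    }
}

Shortening or dropping the trigger moves the same pipeline closer to real time without changing the batch-style code, which is what makes the micro-batch model a convenient middle ground.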

Is One Better Than the Other?

Whether you are pro-batch or pro-stream processing, both are better when working together. Although streaming processing is best for use cases where time matters, and batch processing works well when all the data has been collected, it’s not a matter of which one is better than the other — it really depends on your business objective.

Watch Big Data Integration across Any Cloud now.

However, we’ve seen a big shift in companies trying to take advantage of streaming. A recent survey of more than 16,000 data professionals showed that the most common challenges in data science include everything from dirty data to overall access or availability of data. Unfortunately, streaming tends to accentuate those challenges because the data is in motion. Before jumping into real-time, it is key to solve those data accessibility and quality issues.

When we talk to organizations about how they collect data and accelerate time-to-innovation, they usually share that they want data in real time, which prompts us to ask, “What does real-time mean to you?” The business use cases may vary, but real-time depends on how close the processing time is to the event or data creation time, which could be every hour, every five minutes or every millisecond.

To draw an analogy for why organizations would convert their batch data processes into streaming data processes, let’s take a look at one of my favorite beverages—BEER. Imagine you just ordered a flight of beers from your favorite brewery, and they’re ready for drinking. But before you can consume the beers, perhaps you have to score them based on their hop flavor and rate each beer using online reviews. If you know you have to complete this same repetitive process on each beer, it’s going to take quite some time to get from one beer to the next. For a business, the beer translates into your pipeline data. Rather than wait until you have all the data for processing, instead you can process it in micro batches, in seconds or milliseconds (which means you get to drink your beer flight faster!).

Why Use One Over the Other?

If you don’t have a long history working with streaming processing, you may ask, “Why can’t we just batch like we used to?” You certainly can, but if you have enormous volumes of data, it’s not a matter of when you need to pull data, but when you need to use it.

Companies view real-time data as a game changer, but it can still be a challenge to get there without the proper tools, particularly because businesses need to work with increasing volumes, varieties and types of data from numerous disparate data systems such as social media, web, mobile, sensors, the cloud, etc. At Talend, we’re seeing enterprises typically want to have more agile data processes so they can move from imagination to innovation faster and respond to competitive threats more quickly. For example, data from the sensors on a wind turbine are always-on. So, the stream of data is non-stop and flowing all the time. A typical batch approach to ingest or process this data is obsolete as there is no start or stop of the data. This is a perfect use case where stream processing is the way to go.

The Big Data Debate

It is clear enterprises are shifting priorities toward real-time analytics and data streams to glean actionable information in real time. While outdated tools can’t cope with the speed or scale involved in analyzing data, today’s databases and streaming applications are well equipped to handle today’s business problems.

Here’s the big takeaway from the big data debate: just because you have a hammer doesn’t mean that’s the right tool for the job. Batch and streaming processing are two different models and it’s not a matter of choosing one over the other, it’s about being smart and determining which one is better for your use case.

The post The Big Data Debate: Batch vs. Streaming Processing appeared first on Talend Real-Time Open Source Data Integration Software.

Categories: ETL

Installing to SQL Server Using Windows Integrated Authentication

Liferay - Tue, 09/18/2018 - 11:10

This is a quick post on installing Liferay DXP to a SQL Server database.

The Liferay documentation does include the database properties that make up the connection string. But things aren’t as clear when SQL Server is set up for Windows integrated authentication, hence this post.

The steps below are presented assuming you are setting up a brand new Liferay installation and pointing it to a SQL Server database using the initial Basic Configuration page that comes up following the first startup of a Liferay server. You could use portal-ext.properties to specify the connection URL provided below.

  1. Extract the Liferay DXP bundle.
  2. Add the jdbc driver jar to the classpath.
    1. Download the Sql Server JDBC Driver and copy the relevant jar to your tomcat/lib/ext/.
  3. Start the server.
  4. When presented with the Basic Configuration screen, click Change to change the database from the hsql that ships with Liferay.
  5. Tweak your connection URL. This is important. Note the integratedSecurity=true. (A standalone JDBC check of this URL is sketched after these steps.)
    1. jdbc:sqlserver://hostname;databaseName=databasename;integratedSecurity=true;
  6. Clear your username and password. You’re using Windows integrated authentication, so you don’t need the database user.
  7. That’s not all though. If you attempt to test your configuration by clicking Finish with the above, you will see an error in the logs.
    This driver is not configured for integrated authentication.
    ...
    ...
    Caused by: java.lang.UnsatisfiedLinkError: no sqljdbc_auth in java.library.path
  8. sqljdbc_auth.dll is the missing piece. You should be able to find that DLL file in an auth subfolder of your JDBC Driver download. Place that DLL file somewhere on your file system. IMPORTANT: THIS IS DEFICIENT. SEE COMMENTS AFTER READING. Then make these two changes to your catalina.bat. 
    1. Add the below line just before rem Execute Java with the applicable properties.
      rem Set the Java library path to help the JVM locate sqljdbc_auth.dll for integrated authentication
      set JAVA_LIB_PATH=c:\app\drivers
    2. After rem Execute Java with the applicable properties, there should be a few calls to run tomcat in different execution scenarios. For each of those calls, insert the JVM parameter, java.library.path, and assign the value of the environment variable %JAVA_LIB_PATH% as its argument. One such call is shown below with the new JVM argument shown in bold.
      %_EXECJAVA% %LOGGING_CONFIG% %LOGGING_MANAGER% %JAVA_OPTS% %CATALINA_OPTS% %DEBUG_OPTS% -D%ENDORSED_PROP%="%JAVA_ENDORSED_DIRS%" -classpath "%CLASSPATH%" -Dcatalina.base="%CATALINA_BASE%" -Dcatalina.home="%CATALINA_HOME%" -Djava.io.tmpdir="%CATALINA_TMPDIR%" -Djava.library.path=%JAVA_LIB_PATH% %MAINCLASS% %CMD_LINE_ARGS% %ACTION%

      Make sure you tweak all the Tomcat execution scenarios.

  9. Restart Liferay.
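
If you want to sanity-check the connection URL and the DLL outside of Liferay first, a minimal standalone JDBC test like the following sketch can help. The class name is made up, and the host, database, and c:\app\drivers path are the placeholders used earlier in this post, not anything Liferay-specific:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class IntegratedAuthCheck {

    public static void main(String[] args) throws Exception {
        // Same URL used on the Basic Configuration screen; no username or password,
        // because the driver authenticates as the current Windows user.
        String url = "jdbc:sqlserver://hostname;databaseName=databasename;integratedSecurity=true;";

        // sqljdbc_auth.dll must be on java.library.path, for example:
        //   java -cp .;<sqlserver-jdbc-driver>.jar -Djava.library.path=c:\app\drivers IntegratedAuthCheck
        try (Connection connection = DriverManager.getConnection(url);
                Statement statement = connection.createStatement();
                ResultSet resultSet = statement.executeQuery("SELECT SUSER_SNAME()")) {

            if (resultSet.next()) {
                System.out.println("Connected as: " + resultSet.getString(1));
            }
        }
    }
}

If this prints your Windows login, the same URL should work from the Basic Configuration screen once catalina.bat passes the same java.library.path.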

Javeed Chida 2018-09-18T16:10:00Z
Categories: CMS, ECM

What is Social Shopping and How it Will Affect E-commerce in China

PrestaShop - Tue, 09/18/2018 - 09:54
If you are not yet familiar with the term Social shopping, get ready to hear a lot about it.
Categories: E-commerce