Here is a brief list of the methodologies I have seen people use to execute various analytics projects. It includes a mix of the formal methods of KDD, SEMMA, and CRISP-DM and the intuitive, unwritten practicalities of dealing with a large amount of data in a project environment that is fast-paced and error-intolerant.
To know more about KDD, SEMMA, and CRISP-DM, you can refer to the visual diagrams at http://www.decisionstats.com/visual-guides-to-crisp-dm-kdd-and-semma/
1) Business Phase - done by Business Consultants
This deals with gathering requirements and asking relevant questions of the customers and users of the data and of the creators and maintainers of data storage. The questions should cover the formal and informal rules that affect data quality, business operations, budgeting issues, and the people and processes prevalent in the relevant organization. A review of previous efforts on the same project type is essential to understanding what works and what does not in that business domain.
At the end of this phase, you should have a broad project plan ready that includes what kinds of data to request, what indicative timelines can be committed to, the estimated impact on costs and lift in revenue, and any constraints and assumptions that need to be spelled out. It should clearly state what is in scope and out of scope for the analytics project and set expectations for all stakeholders transparently.
2) Data Phase - done by IT Personnel
This phase requires requesting, transmitting (in a secure and efficient manner, preferably encrypted and compressed), and receiving the data, and then validating its quality and integrity. It should also account for the limitations imposed by legal restrictions on data sharing, masking of critical parts of the data, avoidance of data mangling during retrieval and transport, and adequate checks and communication. It may involve sampling from the whole data population if there is such a need, with further checking to ensure the sample is both adequate and truly representative. Data manipulation may be a necessary part of this, and it may involve keeping or dropping data depth (rows) or breadth (columns). Setting coherent layouts and formats is critical to a timely data phase.
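As a rough illustration of the integrity check and masking mentioned above, here is a minimal Python sketch using only the standard library. It assumes the sender shares a SHA-256 digest along with the file, and that a keyed hash is an acceptable way to mask an identifier; the function names, the sample data, and the key are all hypothetical and not part of any of the formal methodologies.

```python
import hashlib
import hmac


def sha256_of(payload: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def verify_transfer(payload: bytes, expected_digest: str) -> bool:
    """Check that the received bytes match the digest the sender reported."""
    return hmac.compare_digest(sha256_of(payload), expected_digest)


def mask_value(value: str, secret_key: bytes) -> str:
    """Replace a sensitive value with a keyed hash (same input, same mask)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# Hypothetical extract as it left the source system.
sent = b"customer_id,amount\n1001,250.00\n1002,75.50\n"
digest = sha256_of(sent)

# Simulate a file that was mangled in transport (a decimal point was lost).
received_bad = sent.replace(b"75.50", b"7550")

intact_ok = verify_transfer(sent, digest)
mangled_ok = verify_transfer(received_bad, digest)

# Hypothetical project key; in practice it would not live in the source code.
key = b"hypothetical-project-key"
masked_a = mask_value("1001", key)
masked_b = mask_value("1002", key)

print("intact file verified:", intact_ok)
print("mangled file verified:", mangled_ok)
print("masked ids differ:", masked_a != masked_b)
```

Because the mask is deterministic for a given key, masked identifiers can still be used to join datasets without exposing the original values.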
At the end of this phase, you should have an adequately prepared dataset, or series of datasets, that can be used for reporting, pattern finding, forecasting, modeling, or optimization, as the need may be.
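One simple way to check that a sample drawn in this phase is representative is to compare the share of each category in the sample with its share in the full population. The sketch below assumes a stratified sample on a single categorical column; the `region` column, the 10% fraction, and the data are made up for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(rows, key, fraction, seed=0):
    """Draw the same fraction of rows from every group defined by `key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sample = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))  # at least one row per group
        sample.extend(rng.sample(members, k))
    return sample


def proportions(rows, key):
    """Share of rows falling in each category of `key`."""
    counts = Counter(row[key] for row in rows)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def max_proportion_gap(population, sample, key):
    """Largest absolute difference in category share, population vs. sample."""
    pop = proportions(population, key)
    smp = proportions(sample, key)
    return max(abs(pop[v] - smp.get(v, 0.0)) for v in pop)


# Hypothetical population: 250 "north" rows and 750 "south" rows.
population = [
    {"id": i, "region": "north" if i % 4 == 0 else "south"}
    for i in range(1000)
]
sample = stratified_sample(population, "region", 0.1)
gap = max_proportion_gap(population, sample, "region")

print("sample size:", len(sample))
print("largest gap in category share:", gap)
```

A gap close to zero on the columns that matter to the business question is a reasonable first signal; it does not by itself prove the sample is adequate for modeling.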
3) Analysis Phase - done by IT and Statistical Analysts
This phase uses specialized software to aggregate and model the prepared datasets to produce the analysis. It may be done by automated means, but it usually requires IT and Statistical Analysts to work together.
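To make "aggregate and model" concrete, here is a small Python sketch that sums transactions by month and then fits a straight line to the monthly totals with ordinary least squares. It stands in for the specialized software only as an illustration; the data, column names, and the choice of a linear trend are assumptions.

```python
from collections import defaultdict


def aggregate_sum(rows, group_key, value_key):
    """Sum `value_key` over all rows sharing the same `group_key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical transaction-level data prepared in the data phase.
transactions = [
    {"month": 1, "revenue": 60.0}, {"month": 1, "revenue": 40.0},
    {"month": 2, "revenue": 70.0}, {"month": 2, "revenue": 50.0},
    {"month": 3, "revenue": 90.0}, {"month": 3, "revenue": 50.0},
]

# Aggregate: one total per month.
monthly = aggregate_sum(transactions, "month", "revenue")

# Model: linear trend over the monthly totals, then a one-step forecast.
months = sorted(monthly)
slope, intercept = fit_line(months, [monthly[m] for m in months])
forecast_month_4 = slope * 4 + intercept

print("monthly totals:", monthly)
print("trend per month:", slope)
print("forecast for month 4:", forecast_month_4)
```

In a real project the IT side typically owns the aggregation step and the statistical analyst owns the model choice, which is why the two roles usually need to work together here.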
4) Presentation Phase - done by Business Analysts
The end of an analytics project is usually a spreadsheet or a presentation with an attached document to explain the assumptions.
Care must be taken by the business analysts to balance technical detail, ease of understanding for a business audience, and the caveats and assumptions behind the various scenarios and hypotheses under which the analysis holds.
Note that business analysts, statistical analysts, IT personnel, and consultants refer to roles, and these roles can be played by the same people if they have been trained in those distinct but sometimes overlapping skills.