How to do alignment-based conformance checking with PM4Py

In this post we will look at how to use PM4Py to run alignment-based conformance checking, currently the most popular conformance checking approach.

* If you are not familiar with alignment-based conformance checking, see the following post first.

2019/08/29 - [Process Mining - Theory/Conformance Checking] - What is alignment-based conformance checking? (Conformance checking with alignments) - Basics (process-mining.tistory.com)


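* The alignments section of the official PM4Py documentation contains a note about solver performance.

In short, the cvxopt library is much faster than the Google ORTools library that PM4Py currently uses, so anyone who needs fast alignment computation is advised to install cvxopt support themselves and import it. If you need the speed, run

pip install pm4pycvxopt

to download the library, then import it in your script.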
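If the same script has to run on machines with and without the optional package, the import can be made conditional. The following is only a minimal sketch: the helper name cvxopt_backend_available is made up for this example, and it only checks that the package can be found, not that it is compatible with your PM4Py version.

```python
import importlib.util


def cvxopt_backend_available() -> bool:
    """Return True if the optional pm4pycvxopt package can be found."""
    return importlib.util.find_spec("pm4pycvxopt") is not None


if cvxopt_backend_available():
    # Import the package, as the documentation note advises.
    import pm4pycvxopt  # noqa: F401
    print("pm4pycvxopt found and imported")
else:
    # Nothing else to do: PM4Py keeps using its default solver.
    print("pm4pycvxopt not installed, using the default solver")
```

Either way the rest of the script stays the same; only the speed of the alignment computation should differ.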

1. First, import the XES file you want to analyze and store it in log, then store the model you want to check conformance against in net. I chose the inductive miner to discover the model.

import os
from pm4py.objects.log.importer.xes import factory as xes_importer
from pm4py.algo.discovery.inductive import factory as inductive_miner

log = xes_importer.import_log(os.path.join("input_data","running-example.xes")) # pass the path to your own file to join

net, initial_marking, final_marking = inductive_miner.apply(log) # pick the discovery algorithm you want here

2. Apply alignment-based replay to this log and net.

import pm4py
from pm4py.algo.conformance.alignments import factory as align_factory

if __name__ == "__main__": # guard required by the multiprocessing used to speed things up
    alignments = align_factory.apply_log(log, net, initial_marking, final_marking)

3. To see the results, enter the following code.

for i in alignments:
    print(i)

4. If you get output like the following, it worked.

{'alignment': [('register request', 'register request'), ('>>', None), ('examine casually', 'examine casually'), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('reinitiate request', 'reinitiate request'), ('>>', None), ('>>', None), ('examine thoroughly', 'examine thoroughly'), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('>>', None), ('pay compensation', 'pay compensation')], 'cost': 6, 'visited_states': 17, 'queued_states': 45, 'traversed_arcs': 49, 'fitness': 1}
{'alignment': [('register request', 'register request'), ('>>', None), ('check ticket', 'check ticket'), ('examine casually', 'examine casually'), ('>>', None), ('decide', 'decide'), ('>>', None), ('pay compensation', 'pay compensation')], 'cost': 3, 'visited_states': 9, 'queued_states': 23, 'traversed_arcs': 26, 'fitness': 1}
{'alignment': [('register request', 'register request'), ('>>', None), ('examine thoroughly', 'examine thoroughly'), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('>>', None), ('reject request', 'reject request')], 'cost': 3, 'visited_states': 9, 'queued_states': 23, 'traversed_arcs': 25, 'fitness': 1}
{'alignment': [('register request', 'register request'), ('>>', None), ('examine casually', 'examine casually'), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('>>', None), ('pay compensation', 'pay compensation')], 'cost': 3, 'visited_states': 9, 'queued_states': 23, 'traversed_arcs': 25, 'fitness': 1}
{'alignment': [('register request', 'register request'), ('>>', None), ('examine casually', 'examine casually'), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('reinitiate request', 'reinitiate request'), ('>>', None), ('>>', None), ('check ticket', 'check ticket'), ('examine casually', 'examine casually'), ('>>', None), ('decide', 'decide'), ('reinitiate request', 'reinitiate request'), ('>>', None), ('>>', None), ('examine casually', 'examine casually'), ('check ticket', 'check ticket'), ('>>', None), ('decide', 'decide'), ('>>', None), ('reject request', 'reject request')], 'cost': 9, 'visited_states': 25, 'queued_states': 67, 'traversed_arcs': 74, 'fitness': 1}
{'alignment': [('register request', 'register request'), ('>>', None), ('check ticket', 'check ticket'), ('examine thoroughly', 'examine thoroughly'), ('>>', None), ('decide', 'decide'), ('>>', None), ('reject request', 'reject request')], 'cost': 3, 'visited_states': 9, 'queued_states': 23, 'traversed_arcs': 26, 'fitness': 1}

The fields in this output mean the following.

alignment: the computed alignment. Each pair is (log step, model step). A pair with the same activity on both sides is a sync move, (>>, activity) is a model move, (activity, >>) is a log move, and (>>, None) is a silent move.
cost: the cost of the computed alignment.
fitness: the alignment-based fitness value for that trace.
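The move types described above can be counted with a few lines of plain Python. This is only an illustrative sketch: classify_move and SKIP are names made up for this example, and the sample is the second alignment from the output above, copied by hand rather than produced by PM4Py.

```python
from collections import Counter

SKIP = ">>"  # marker for "no step on this side"


def classify_move(move):
    """Label one (log step, model step) pair as sync, model, log or silent."""
    log_part, model_part = move
    if log_part == SKIP and model_part is None:
        return "silent"  # hidden transition fired in the model
    if log_part == SKIP:
        return "model"   # the model moved, the log did not
    if model_part == SKIP:
        return "log"     # the log moved, the model did not
    if log_part == model_part:
        return "sync"    # both sides agree on the activity
    raise ValueError(f"unexpected move: {move!r}")


# Second alignment from the output above.
result = {
    "alignment": [
        ("register request", "register request"), (">>", None),
        ("check ticket", "check ticket"),
        ("examine casually", "examine casually"), (">>", None),
        ("decide", "decide"), (">>", None),
        ("pay compensation", "pay compensation"),
    ],
    "cost": 3,
}

counts = Counter(classify_move(m) for m in result["alignment"])
print(counts)
```

In this sample output there are no log or model moves, and the reported cost happens to equal the number of silent moves in each alignment.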

5. If you want the fitness of the whole log rather than of each trace, enter the following code.

from pm4py.evaluation.replay_fitness import factory as replay_fitness_factory

log_fitness = replay_fitness_factory.evaluate(alignments, variant="alignments")
print(log_fitness)

6. If you get output like the following, it worked.

{'percFitTraces': 100.0, 'averageFitness': 1.0}

Each value has the following meaning.

percFitTraces: the percentage of traces that fit the model perfectly
averageFitness: the average fitness value over the whole event log
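To make the two definitions concrete, they can be recomputed by hand from the per-trace results. This is a sketch of the definitions as stated here, not PM4Py's own implementation: summarize_fitness is a made-up helper, and the sample list keeps only the cost and fitness fields of the six alignments printed earlier.

```python
def summarize_fitness(alignments):
    """Aggregate per-trace fitness values into log-level figures."""
    fitness_values = [a["fitness"] for a in alignments]
    if not fitness_values:
        raise ValueError("no alignments to summarize")
    n = len(fitness_values)
    # A trace counts as perfectly fitting when its fitness is exactly 1.
    perfectly_fitting = sum(1 for f in fitness_values if f == 1)
    return {
        "percFitTraces": 100.0 * perfectly_fitting / n,
        "averageFitness": sum(fitness_values) / n,
    }


# cost and fitness of the six traces from the output in step 4
sample = [
    {"cost": 6, "fitness": 1},
    {"cost": 3, "fitness": 1},
    {"cost": 3, "fitness": 1},
    {"cost": 3, "fitness": 1},
    {"cost": 9, "fitness": 1},
    {"cost": 3, "fitness": 1},
]

summary = summarize_fitness(sample)
print(summary)
```

Because every trace has fitness 1, both figures come out at their maximum, which agrees with the result shown in step 6.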

In this post we looked at how to use pm4py for alignment-based replay conformance checking. With just a few lines of code you can measure how well a process model fits the event log, which makes it easier to find the best process model.
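The factory imports used in the steps above match the PM4Py release that was current when this post was written, and newer releases may no longer ship them. As a rough sketch, assuming a PM4Py 2.x installation, the same steps can be written with the top-level functions pm4py.read_xes, pm4py.discover_petri_net_inductive, pm4py.conformance_diagnostics_alignments and pm4py.fitness_alignments. The helper name run_alignment_check is made up for this example, and the exact keys in the returned dictionaries can differ between releases.

```python
import os


def run_alignment_check(xes_path):
    """Discover a Petri net from the log at xes_path and align the log to it.

    Assumes PM4Py 2.x. The import sits inside the function so that defining
    the helper does not by itself require PM4Py to be installed.
    """
    import pm4py

    # Step 1: load the log and discover a model with the inductive miner.
    log = pm4py.read_xes(xes_path)
    net, initial_marking, final_marking = pm4py.discover_petri_net_inductive(log)

    # Step 2: compute one alignment per trace.
    alignments = pm4py.conformance_diagnostics_alignments(
        log, net, initial_marking, final_marking
    )

    # Step 5: log-level fitness based on alignments.
    log_fitness = pm4py.fitness_alignments(log, net, initial_marking, final_marking)
    return alignments, log_fitness


# Hypothetical path, mirroring the one used in step 1.
EXAMPLE_PATH = os.path.join("input_data", "running-example.xes")
```

Calling run_alignment_check(EXAMPLE_PATH) from inside an if __name__ == "__main__": block keeps the multiprocessing guard from step 2 in place.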

License: Attribution, NonCommercial, NoDerivatives
